Reprogramming Discovery: How AMD Instinct™ GPUs Are Powering the Next Wave of AI-Driven Biology

May 19, 2025

IPA Therapeutics uses AMD Instinct MI300X GPUs to reprogram discovery

Contributed by IPA Therapeutics


Part 1: Embeddings for NLP in Life Sciences

This article is the first in a three-part series benchmarking AMD Instinct™ MI300X GPUs against NVIDIA’s H100 GPUs across real-world AI workloads in drug discovery. The benchmarks were conducted by ImmunoPrecise Antibodies (IPA) and its AI subsidiary BioStrand, creators of the LENSai™ platform for AI-powered biologics discovery — in collaboration with Vultr, whose high-performance cloud infrastructure enabled rapid deployment and reproducibility across hardware configurations. Together, we evaluated how these GPUs perform under the practical demands of therapeutic development: from NLP-driven target discovery to generative protein design.

At the core of LENSai lies HYFT® technology — a biological fingerprinting system that encodes conserved sequence, structure, and function into a unified index. HYFTs were built to solve a fundamental limitation in AI: its lack of native understanding of biological systems. By embedding biological logic into the fabric of computation, HYFTs give AI models the context to reason through biology, not just compute it.

Over the three articles in this series, we’ll explore how MI300X GPUs perform across the LENSai tech stack: NLP-driven literature mining, creation of protein embeddings for structure-function inference, and generative design through RFdiffusion.

Through real-world benchmarks in NLP, protein embeddings, and de novo protein design, we set out to evaluate raw performance, cost efficiency, and deployment viability for modern bioinformatics pipelines.

In this first installment, we focus on Natural Language Processing (NLP)—specifically, how large language models and Retrieval-Augmented Generation (RAG) accelerate early-stage therapeutic discovery by extracting actionable insights from scientific literature. The key takeaway? AMD GPUs are not only competitive in speed but also offer substantial cost advantages—a critical factor for life science organizations scaling AI-driven platforms.

Natural Language Processing accelerates therapeutic innovation by mining vast bodies of text, unlocking latent insights from scientific literature, clinical reports, and molecular databases. NLP-driven large language models (LLMs) streamline the analysis and prediction steps essential to drug discovery, and they align with the FDA’s shift toward computational models that emphasize safety, efficacy, and cost-efficiency.

Vector embeddings in RAG (Retrieval-Augmented Generation) systems enable knowledge-aware models to surface relevant insights based on semantics rather than phrasing. These embeddings aren’t limited to text; they support biological sequences and structures as well, enabling NLP to bridge silos in life sciences.
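
To make the retrieval step concrete, the sketch below shows how embedding-based semantic search works in general: documents and a query are projected into the same vector space and ranked by cosine similarity. It is a minimal illustration, not the LENSai implementation; the sentence-transformers library and the MiniLM model named here are assumptions chosen for the example.

# Minimal semantic-retrieval sketch (illustrative only; not the LENSai stack).
# Assumes the open-source sentence-transformers package and a generic MiniLM model.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # example embedding model

corpus = [
    "BRCA1 mutations are associated with increased breast cancer risk.",
    "Imatinib inhibits the BCR-ABL tyrosine kinase in chronic myeloid leukemia.",
    "The PD-1/PD-L1 axis suppresses T-cell mediated antitumor immunity.",
]
query = "Which kinase does imatinib target?"

# Embed corpus and query into the same vector space, then rank by cosine similarity.
corpus_emb = model.encode(corpus, convert_to_tensor=True)
query_emb = model.encode(query, convert_to_tensor=True)
scores = util.cos_sim(query_emb, corpus_emb)[0]

best = int(scores.argmax())
print(f"Best match (score {float(scores[best]):.3f}): {corpus[best]}")

Note how the query matches the imatinib sentence on meaning even though the two share little exact wording; that is the behavior the paragraph above describes.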

LENSai builds on today’s vector search capabilities and takes them further, adding a powerful semantic layer that detects sub-sentence units and extracts subject–predicate–object triples to uncover meaningful biological relationships. By capturing how targets, pathways, and compounds interact at a mechanistic level, LENSai empowers researchers to identify therapeutic targets, map disease pathways, and anticipate drug behavior with greater clarity. This depth of insight—often buried in unstructured biomedical data—can be surfaced and acted on long before wet lab experiments begin, accelerating discovery while reducing cost and risk.
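
As a rough illustration of the triple-extraction idea (not the proprietary HYFT/LENSai method), subject–predicate–object relations can be read off a dependency parse. The sketch below assumes the open-source spaCy library and its small English model purely for demonstration.

# Toy subject-predicate-object extraction via dependency parsing.
# Illustrative only: LENSai's semantic layer is proprietary; spaCy and the
# en_core_web_sm model are assumptions made for this sketch.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Imatinib inhibits the BCR-ABL kinase in chronic myeloid leukemia.")

triples = []
for verb in (t for t in doc if t.pos_ == "VERB"):
    subjects = [w for w in verb.lefts if w.dep_ in ("nsubj", "nsubjpass")]
    objects = [w for w in verb.rights if w.dep_ in ("dobj", "attr", "dative")]
    for s in subjects:
        for o in objects:
            triples.append((s.text, verb.lemma_, o.text))

print(triples)  # e.g. [('Imatinib', 'inhibit', 'kinase')]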

Infrastructure Context

We deployed both NVIDIA H100 and AMD Instinct MI300X GPUs in a flexible, cloud-native environment, ensuring reproducible benchmarks and a fair comparison between the two hardware platforms.

GPU Specification    | AMD Instinct™ MI300X | NVIDIA H100
Memory Capacity      | 192 GB               | 80 GB
GPU Architecture     | CDNA 3               | Hopper
Supported Precisions | FP64/FP32/FP16       | FP64/FP32/FP16
Deployment Mode      | Cloud-native         | Cloud-native


NLP Benchmark Results

Our retrieval-augmented generation (RAG) systems use vector embeddings of the literature to surface contextually relevant insights. In this workload, the AMD Instinct MI300X demonstrated superior throughput and cost-efficiency:

Metric              | NVIDIA H100 | AMD Instinct™ MI300X
Sequences/sec       | 2741.21     | 3421.22
Cost per 1M samples | $2.40       | $1.46


The MI300X also exhibited enhanced stability under high concurrency workloads.
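
For context on how metrics like these can be derived, the sketch below times batched embedding generation and converts throughput into a cost per million samples. It is not the benchmark harness behind the numbers above; the model name, batch size, and hourly GPU rate are placeholder assumptions.

# Generic throughput/cost sketch (not the actual benchmark harness).
# Model choice, batch size, and the hourly GPU rate are placeholder assumptions.
import time
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2", device="cuda")  # on ROCm builds, "cuda" targets AMD GPUs

texts = ["TP53 loss of function drives genomic instability in many tumors."] * 25_600
_ = model.encode(texts[:256], batch_size=256)  # warm-up pass

start = time.perf_counter()
model.encode(texts, batch_size=256, show_progress_bar=False)
elapsed = time.perf_counter() - start

seqs_per_sec = len(texts) / elapsed
hourly_rate_usd = 2.50  # placeholder cloud price per GPU-hour; actual pricing varies
cost_per_million = (1_000_000 / seqs_per_sec) / 3600 * hourly_rate_usd
print(f"{seqs_per_sec:.2f} sequences/sec  ->  ${cost_per_million:.2f} per 1M samples")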

Technical Implementation: Seamless Transition to AMD GPUs

Transitioning NLP tasks to AMD GPUs via ROCm PyTorch Docker images is straightforward:

FROM rocm/pytorch:rocm6.3.1_ubuntu22.04_py3.10_pytorch

No changes are required in the Python code—PyTorch's device abstraction (torch.device("cuda")) ensures compatibility.
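
A minimal sketch of what that looks like in practice, assuming a generic Hugging Face Transformers model (the model named here is an illustrative choice, not necessarily what LENSai runs):

# Device-agnostic embedding code: on ROCm builds of PyTorch, the "cuda" device
# string transparently maps to AMD GPUs such as the MI300X, so the same script
# runs unmodified on NVIDIA and AMD hardware.
import torch
from transformers import AutoModel, AutoTokenizer

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # example model
model = AutoModel.from_pretrained("bert-base-uncased").to(device).eval()

inputs = tokenizer("BRCA1 mutations increase breast cancer risk.", return_tensors="pt").to(device)
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state  # token-level representations
embedding = hidden.mean(dim=1)                  # simple mean-pooled sentence embedding
print(embedding.shape, embedding.device)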

These NLP benchmarks illustrate how AMD Instinct™ MI300X GPUs deliver both technical and economic value in one of the most fundamental layers of AI-assisted drug discovery.

In Part 2 of this series, we’ll move deeper into the biological stack, exploring how protein language models and biological embeddings reshape how we understand sequences, mutations, and functional relevance in drug development.


IPA (ImmunoPrecise Antibodies, NASDAQ: IPA) is a biotherapeutic research company that brings industry-leading antibody discovery services and complex artificial intelligence technologies together to lead its pharmaceutical partners into the era of the antibody.

Vultr is led by veterans of the managed hosting business who have taken 20+ years of experience in complex hosting environments and made it their mission to simplify the cloud.

 

 
