Reprogramming Discovery: How AMD GPUs Are Powering the Next Wave of AI-Driven Biology
May 28, 2025

Contributed by IPA Therapeutics
Part 2: Protein Embeddings for Enhanced Biological Analysis
This is the second article in our three-part benchmarking series comparing the AMD Instinct MI300X GPUs to NVIDIA H100 GPUs across key AI workloads in drug discovery.
In Part 1, we explored NLP-based knowledge extraction for biomedical research. Now, we shift focus to the protein layer to evaluate how both GPUs handle large-scale protein language models (pLLMs) used for understanding structure, function, and mutational effects.
These benchmarks were conducted by ImmunoPrecise Antibodies (IPA) and its AI division BioStrand, developers of LENSai™, an AI-native platform that unifies sequence, structure, and functional reasoning to drive biologics discovery. All tests were run on high-performance cloud infrastructure provided by Vultr, enabling reproducible, side-by-side comparisons in a production-grade environment.
At the core of LENSai lies HYFT® technology, a biological fingerprinting system that encodes conserved sequence, structure, and function into a unified index. HYFTs were built to solve a fundamental limitation in AI: its lack of native understanding of biological systems. By embedding biological logic into the fabric of computation, HYFTs give AI models the context to reason through biology, not just compute it.
Protein embeddings are the foundation for understanding molecular function, mutation effects, and binding interactions, and they power everything from drug target prioritization to structural modeling. By benchmarking ESM-2 model variants and exploring how LENSai and the HYFT technology use “anchored embeddings,” we show how AMD GPUs perform in a domain where memory capacity and biological accuracy are critical.
Protein language models (pLLMs) encode amino acid sequences as compact numerical vectors (embeddings) that capture functional and evolutionary information, which significantly enhances biological data analysis.
They allow researchers to ask questions such as: How similar is this sequence to known druggable targets? What structure is this unknown protein likely to adopt? How might a mutation affect function or binding? Embeddings compress this information into forms that machine learning models can reason about, enabling breakthroughs in antibody discovery, immunogenicity screening, and multi-omics interpretation.
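The first question above, how similar a sequence is to known targets, usually comes down to comparing embedding vectors. The following is a minimal sketch of that comparison using cosine similarity, not the LENSai implementation. The names `query_embedding` and `target_library` are hypothetical, and the four-dimensional vectors are invented for illustration; real ESM-2 embeddings have hundreds to thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, library):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in library.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors, invented for illustration only (not model output).
query_embedding = [0.9, 0.1, 0.0, 0.4]
target_library = {
    "target_A": [0.8, 0.2, 0.1, 0.5],
    "target_B": [-0.3, 0.9, 0.4, 0.0],
    "target_C": [0.1, 0.0, 1.0, 0.2],
}

ranking = rank_by_similarity(query_embedding, target_library)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In practice the vectors would come from a protein language model, and a large library would typically be searched with a vector index rather than a linear scan.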
ESM-2 Protein Language Model Benchmarks
We benchmarked ESM-2 at three model sizes to compare throughput and scaling on the two GPUs:
| Model Size | NVIDIA H100 (seq/sec) | AMD Instinct MI300X (seq/sec) | Cost Reduction |
| --- | --- | --- | --- |
| Small (35M) | 2482.41 | 3413.15 | ~44% |
| Medium (650M) | 368.04 | 637.94 | ~63% |
| Large (3B) | 111.19 | 178.76 | ~55% |
The MI300X handled larger batches efficiently, delivering higher throughput at every model size along with significant cost savings.
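To make the throughput numbers easier to compare, the sketch below derives the MI300X-to-H100 speedup for each model size and the wall-clock time to embed a batch of sequences. The throughput values are copied from the table; the one-million-sequence workload is a hypothetical example. The cost reduction column is not reproduced here because it also depends on hourly instance pricing, which this article does not state.

```python
# Throughput figures (sequences per second) copied from the benchmark table.
BENCHMARKS = {
    "Small (35M)": {"h100": 2482.41, "mi300x": 3413.15},
    "Medium (650M)": {"h100": 368.04, "mi300x": 637.94},
    "Large (3B)": {"h100": 111.19, "mi300x": 178.76},
}


def speedup(h100_seq_per_sec, mi300x_seq_per_sec):
    """Ratio of MI300X throughput to H100 throughput."""
    return mi300x_seq_per_sec / h100_seq_per_sec


def hours_to_embed(num_sequences, seq_per_sec):
    """Wall-clock hours to embed num_sequences at a steady throughput."""
    return num_sequences / seq_per_sec / 3600.0


def summarize(benchmarks, num_sequences=1_000_000):
    """Build one summary row per model size."""
    rows = []
    for model, t in benchmarks.items():
        rows.append({
            "model": model,
            "speedup": speedup(t["h100"], t["mi300x"]),
            "h100_hours": hours_to_embed(num_sequences, t["h100"]),
            "mi300x_hours": hours_to_embed(num_sequences, t["mi300x"]),
        })
    return rows


summary = summarize(BENCHMARKS)
for row in summary:
    print(
        f"{row['model']}: {row['speedup']:.2f}x throughput, "
        f"{row['h100_hours']:.2f} h (H100) vs {row['mi300x_hours']:.2f} h (MI300X)"
    )
```

The speedups work out to roughly 1.4x, 1.7x, and 1.6x for the small, medium, and large models respectively.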
Integrated Drug Discovery with LENSai and HYFT
LENSai employs "HYFT-anchored embeddings," selectively embedding residues within conserved HYFT patterns, reducing noise and enhancing biological signal clarity.
HYFTs are biologically meaningful sub-sequence units representing conserved motifs tied to function or structure. By anchoring embeddings to HYFTs, LENSai minimizes irrelevant noise and focuses only on the most functionally informative aspects of the sequence.
HYFT-based embeddings support:
- Accurate prediction of structural mutation impacts.
- Identification of conserved motifs.
- Efficient semantic search across therapeutic candidate libraries.
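One way to picture the anchoring idea is as a pooling step restricted to residues that fall inside conserved spans. The sketch below is an illustrative assumption, not the LENSai or HYFT implementation: the function names, the `(start, end)` span format, and the toy two-dimensional per-residue vectors are all hypothetical.

```python
def mean_pool(vectors):
    """Element-wise average of a non-empty list of equal-length vectors."""
    if not vectors:
        raise ValueError("cannot pool an empty set of vectors")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def anchored_pool(residue_embeddings, anchor_spans):
    """Average only the residues inside the given spans.

    Spans are (start, end) index pairs, start inclusive and end exclusive.
    Overlapping spans are merged so no residue is counted twice.
    """
    n = len(residue_embeddings)
    selected = set()
    for start, end in anchor_spans:
        if not 0 <= start < end <= n:
            raise ValueError(f"span ({start}, {end}) is outside the sequence")
        selected.update(range(start, end))
    return mean_pool([residue_embeddings[i] for i in sorted(selected)])


# Toy per-residue vectors for a 6-residue sequence (invented values).
residues = [[8, 8], [1, 2], [3, 4], [9, 9], [5, 6], [7, 7]]
# Hypothetical conserved spans covering residues 1-2 and residue 4.
spans = [(1, 3), (4, 5)]

full_vector = mean_pool(residues)
anchored_vector = anchored_pool(residues, spans)
print("full sequence:", full_vector)
print("anchored only:", anchored_vector)
```

Residues outside the spans contribute nothing to the anchored vector, which is the sense in which anchoring filters out less informative positions.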
Technical Implementation: Seamless Transition to AMD GPUs
Implementing protein embeddings on AMD GPUs involves a single-line Dockerfile update, ensuring minimal disruption:
FROM rocm/pytorch:rocm6.3.1_ubuntu22.04_py3.10_pytorch
No major code modifications are required.
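After switching the base image, a quick sanity check can confirm which backend the installed PyTorch build targets. The sketch below is a hedged example rather than part of the benchmark code; the helper names `describe_backend` and `pick_device` are hypothetical. It relies on the fact that ROCm builds of PyTorch expose AMD GPUs through the same `torch.cuda` API and the `"cuda"` device string, which is why existing model code typically runs unchanged.

```python
def describe_backend():
    """Summarize which GPU backend the installed PyTorch build targets."""
    try:
        import torch
    except ImportError:
        # Lets the check run (and report) even where PyTorch is absent.
        return {"torch_installed": False}

    info = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # A version string on ROCm builds, None on CUDA or CPU-only builds.
        "hip_version": getattr(torch.version, "hip", None),
        "gpu_available": torch.cuda.is_available(),
    }
    if info["gpu_available"]:
        info["device_name"] = torch.cuda.get_device_name(0)
    return info


def pick_device(info):
    """Return the device string to pass to .to(...) for models and tensors."""
    # On ROCm builds the device string is still "cuda".
    return "cuda" if info.get("gpu_available") else "cpu"


backend_info = describe_backend()
print(backend_info)
print("device:", pick_device(backend_info))
```

On a ROCm container with a visible GPU, `hip_version` should be non-empty and the device name should identify the AMD accelerator.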
Protein embeddings are the connective tissue between raw sequences and functional interpretation. As our benchmarks demonstrate, the AMD Instinct MI300X provides the throughput and memory headroom needed for today’s most advanced protein models.
In the final post of this series, we’ll push the boundaries of AI-driven design itself — benchmarking RFdiffusion, a generative model capable of imagining and engineering entirely new proteins from scratch.
IPA Therapeutics (ImmunoPrecise Antibodies, NASDAQ: IPA) is a biotherapeutic research company that brings industry-leading antibody discovery services and complex artificial intelligence technologies together to lead its pharmaceutical partners into the era of the antibody.
Vultr is led by veterans of the managed hosting business who have taken 20+ years of experience in complex hosting environments and made it their mission to simplify the cloud.
