AMD and MLPerf Endpoints: Leading the Next Era of GenAI Benchmarking

Mar 19, 2026


How AMD is helping advance open, community-driven benchmarking for real-world generative AI services

The landscape of generative AI performance evaluation is undergoing an important shift. The focus is moving from raw hardware performance to evaluation from the perspective of a service provider: while the token throughput of a GPU still matters, what takes top priority is how those tokens are served to customers. The infrastructure is evolving too, from individual serving frameworks to managed services delivered through API endpoints.

To keep pace with those changes, MLCommons, the consortium behind the industry-standard MLPerf benchmarks, has been developing MLPerf Endpoints, a new AI inference benchmarking suite focused on GenAI serving. Recognizing the importance of this initiative, AMD has been involved since day one, helping to define workloads, rules, and infrastructure for this new approach to performance evaluation. MLPerf Endpoints introduces several new features, including an API-centric architecture and rolling submissions, enabling benchmarking at the speed of software updates.
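To make the API-centric idea concrete, here is a minimal sketch of how a client-side load generator might measure time-to-first-token and token throughput against a live, OpenAI-compatible chat-completions endpoint. This is not the official MLPerf Endpoints harness; the endpoint URL, model name, and payload shape are illustrative assumptions.

```python
import json
import time

import requests  # assumes the 'requests' package is installed

# Illustrative values only -- not the official MLPerf Endpoints harness.
ENDPOINT = "http://localhost:8000/v1/chat/completions"  # hypothetical URL
MODEL = "example-model"                                  # hypothetical model name

def measure_request(prompt: str) -> dict:
    """Send one streaming request and record client-side latency metrics."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,  # stream tokens so time-to-first-token is observable
    }
    start = time.perf_counter()
    first_token_time = None
    chunks = 0
    with requests.post(ENDPOINT, json=payload, stream=True, timeout=120) as resp:
        resp.raise_for_status()
        for line in resp.iter_lines():
            # Server-sent events arrive as lines prefixed with "data: ".
            if not line or not line.startswith(b"data: "):
                continue
            data = line[len(b"data: "):]
            if data == b"[DONE]":
                break
            chunk = json.loads(data)
            delta = chunk["choices"][0]["delta"].get("content")
            if delta:
                if first_token_time is None:
                    first_token_time = time.perf_counter()
                chunks += 1  # counting stream chunks as a rough proxy for tokens
    end = time.perf_counter()
    return {
        "ttft_s": (first_token_time or end) - start,  # time to first token
        "tokens_per_s": chunks / (end - start),       # rough decode throughput
    }

if __name__ == "__main__":
    print(measure_request("Explain what an inference endpoint is."))
```

The point of the sketch is the vantage point: everything is measured from outside the service, through the same API a customer would call, rather than from inside a specific serving framework.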

That is why AMD is helping lead MLPerf Endpoints. As one of the first five collaborators providing initial Endpoints data, we are helping shape the initiative because we believe the next generation of AI benchmarks should be open, community-governed, and grounded in real-world deployment. For AMD, leadership in AI means more than delivering strong performance. It means showing up early, contributing to the standards that matter, and advancing a benchmarking model built on transparency, broad industry participation, and long-term trust.

“AMD has long championed open standards and community-driven benchmarking. As a founding member of MLCommons and a regular participant in MLPerf Training and Inference, we believe transparent evaluation benefits the entire ecosystem. MLPerf Endpoints extends that collaborative spirit to generative AI serving, and AMD is proud to support the initiative.” – Emad Barsoum, Corporate VP, AMD.

AMD brings further credibility to MLPerf Endpoints because our commitment to openness is already visible across the AMD ROCm™ software stack. ROCm provides an open foundation for AI development and deployment, designed to support ecosystem collaboration, transparency, and broad software compatibility. AMD has carried those same principles into benchmarking as well, publishing reproducibility guides for all of our MLPerf Inference and Training submissions on an open-source ROCm stack using open-source AI frameworks. Our participation in MLPerf Endpoints is therefore a continuation of our broader commitment to open software, transparent benchmarking, and community-led progress.

MLPerf has earned industry trust because MLCommons has built it around fair, representative, and reproducible evaluation through a community-driven process. Its benchmark suites are defined by working-group communities of experts, which is a big reason MLPerf matters to buyers, builders, and the broader ecosystem: the standard is shaped through an open industry process rather than by a single vendor or publisher.

"MLPerf Endpoints measures AI performance the way customers actually experience it — through live API endpoints, with results produced continuously using the same trusted methods from MLPerf Inference. AMD helped us build this from day one, and their willingness to put real production infrastructure through the benchmark is exactly what makes MLPerf the trusted industry standard." David Kanter, co-founder of MLCommons, Head of MLPerf

That distinction is increasingly important. In a fast-moving AI landscape, the industry benefits from benchmarks that are not only technically relevant but also shaped through broad community participation. MLPerf Endpoints brings that added dimension through neutral nonprofit stewardship, formal working-group development, and a governance model that gives members a voice in how the benchmark evolves. For AMD, that community-driven approach matters because it helps create a standard the broader ecosystem can inspect, influence, and trust over time.

MLPerf Endpoints reflects the same principles AMD has championed across the AI stack: openness, transparency, and community-led progress. By measuring GenAI as it is actually consumed, through APIs and production services, it offers a more realistic view of performance than closed evaluation models that are controlled by a single organization and offer limited transparency into methodology or governance. With its API-centric framework, Pareto-style reporting, and rolling submissions, MLPerf Endpoints represents a more open and more practical path for evaluating modern GenAI systems, making AMD’s participation a natural extension of our broader commitment to open innovation and industry collaboration.
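To illustrate what Pareto-style reporting means in practice, the short sketch below computes a Pareto frontier over hypothetical (latency, throughput) measurements: a configuration is reported only if no other configuration achieves both lower latency and higher throughput. The data points are invented for illustration and do not reflect any actual MLPerf Endpoints results.

```python
# Illustrative sketch of Pareto-style reporting: keep only configurations
# that are not dominated (i.e., no other config has lower latency AND
# higher throughput). All data points below are hypothetical.
points = [
    # (p99 latency in seconds, throughput in tokens/s) -- invented values
    (0.8, 1200.0),
    (1.2, 1800.0),
    (1.5, 1700.0),   # dominated by (1.2, 1800.0)
    (2.0, 2400.0),
    (2.5, 2300.0),   # dominated by (2.0, 2400.0)
]

def pareto_frontier(points):
    """Return the points not dominated on (latency, throughput)."""
    frontier = []
    best_throughput = float("-inf")
    # Sweep from lowest latency upward; keep a point only if it improves
    # on the best throughput seen so far.
    for latency, throughput in sorted(points):
        if throughput > best_throughput:
            frontier.append((latency, throughput))
            best_throughput = throughput
    return frontier

print(pareto_frontier(points))
# -> [(0.8, 1200.0), (1.2, 1800.0), (2.0, 2400.0)]
```

Reporting a frontier rather than a single headline number acknowledges that serving systems trade latency against throughput, and lets readers pick the operating point that matches their service-level objectives.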

As AI becomes a service, the industry needs benchmarks that are realistic, transparent, and openly governed. That matters because benchmark methodology increasingly shapes how AI infrastructure is evaluated, shortlisted, and purchased. By helping lead MLPerf Endpoints, AMD is making a clear statement: the future of AI should be defined by both cutting-edge performance and open, community-driven standards. We look forward to continuing to help shape the benchmark and publish results on AMD Instinct™ GPUs as the initiative advances.
