AMD at OCP 2025: Driving Open Ecosystems and Scalable Compute for the AI Era
Oct 14, 2025

At the OCP Global Summit 2025, AMD is showcasing how open collaboration and open standards are reshaping the future of data center and AI infrastructure. As AI workloads grow from single racks to entire regions, the need for interoperability, scalability, and sustainability has never been greater.
The efforts by AMD, working across the broad OCP community over the past year, reflect a clear vision: open ecosystems thrive. By contributing to open hardware, software, and networking standards, AMD is helping the industry reduce friction, accelerate technology adoption, and enable choice and flexibility at scale.
Building the Foundation: A Year of Collaborative Progress
Together with hyperscalers, ODMs, and fellow OCP members, AMD has advanced multiple OCP initiatives that are already impacting how data centers are designed and deployed.
One major area of progress is platform enablement. As an early participant in the DC-MHS initiative, AMD helped pioneer a modular approach to server design that makes it easier for the industry to adopt new technologies. AMD OCP-compliant server reference designs — supporting today’s AMD EPYC™ 9005 processors and future product generations — integrate standards such as DC-MHS, M-CRPS, and DC-SCM, along with OCP-compliant NICs, to support interoperability, vendor neutrality, and scalability in modern data centers while enabling faster ecosystem readiness and broader industry choice.
Security has been another critical focus. AMD co-founded Project Caliptra, an open-source hardware root of trust designed to help protect confidential computing workloads. Alongside adoption of OCP S.A.F.E. and leadership in the OCP Attestation Specification, AMD is helping devices prove their trustworthiness through cryptographic evidence — a key requirement for more secure, large-scale deployments.
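To illustrate the general evidence-based attestation pattern at a high level, here is a hedged sketch (not the actual Caliptra implementation or the OCP Attestation Specification flow; the key handling and evidence format are simplified assumptions) in which a verifier checks that device-signed measurements are fresh and match a trusted reference:

```python
# Hedged sketch of evidence-based attestation (illustrative only; not the
# Caliptra implementation or the OCP Attestation Specification wire format).
import os
import hashlib

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Device side: a hardware root of trust measures firmware at boot and holds
# a device identity key (generated here only for the sake of the example).
device_key = Ed25519PrivateKey.generate()
firmware_measurement = hashlib.sha384(b"firmware image contents").digest()

def produce_evidence(nonce: bytes) -> tuple[bytes, bytes]:
    """Sign the measurement together with a verifier-supplied nonce so the
    evidence is fresh and cannot be replayed."""
    payload = firmware_measurement + nonce
    return payload, device_key.sign(payload)

# Verifier side: knows the device public key and a trusted reference
# measurement (e.g., from a signed manifest).
device_pub = device_key.public_key()
reference_measurement = firmware_measurement

nonce = os.urandom(32)
payload, signature = produce_evidence(nonce)

try:
    device_pub.verify(signature, payload)  # raises InvalidSignature on failure
    if payload != reference_measurement + nonce:
        raise ValueError("measurement does not match the trusted reference")
    print("device attested: evidence is fresh, signed, and matches reference")
except (InvalidSignature, ValueError) as err:
    print(f"attestation failed: {err}")
```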
Networking standards have also gained momentum. AMD is driving UALink and Ultra Ethernet to deliver open, multi-vendor scale-up and scale-out fabrics, enabling any CPU, any accelerator, at any scale. This work complements the broader AMD commitment to open interconnects that avoid proprietary vendor lock-in.
Beyond these efforts, AMD has:
- Formed the x86 Ecosystem Advisory Group (EAG), bringing together AMD, Intel and key ecosystem partners in a shared commitment to advancing the future of the x86 platform through collaborative decision-making, standardized features, and developer-friendly innovations. Check out the AMD blog marking the one-year anniversary of the x86 EAG and highlighting the notable progress and technology milestones achieved over the past year.
- Expanded its open software ecosystem through ROCm™, open-source libraries, and contributions to the Linux Foundation, PyTorch Foundation, and other open governance bodies — giving developers freedom to build without lock-in (see the short portability sketch after this list).
- Led advancements in RAS (Reliability, Availability, and Serviceability) standards for OCP CPUs and GPUs, addressing large-scale data center pain points in hardware fault management, memory error handling, and debug capabilities.
- Contributed to management and cooling initiatives, from the Open Boot & Management Framework to liquid cooling interoperability projects, supporting sustainable, high-performance infrastructure.
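The software bullet above notes that the open software ecosystem gives developers freedom to build without lock-in. As a small, hedged illustration (not an official AMD or ROCm example), the snippet below runs unchanged on ROCm and CUDA builds of PyTorch, since ROCm exposes the same torch.cuda device interface:

```python
# Minimal sketch: the same PyTorch code runs unmodified on ROCm or CUDA
# builds, because ROCm exposes the familiar torch.cuda device interface.
import torch

# On ROCm wheels, torch.version.hip is a version string; on CUDA wheels
# it is None, so this distinguishes the two builds.
backend = "ROCm" if getattr(torch.version, "hip", None) else "CUDA"
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Build: {backend}, running on: {device}")

# Device-agnostic math: nothing here is vendor-specific.
x = torch.randn(1024, 1024, device=device)
y = x @ x.T            # matrix multiply on whichever accelerator is present
print(y.norm().item())
```

Because ROCm implements the same device semantics, frameworks and downstream libraries can target AMD hardware without vendor-specific code paths.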
Performance and scalability are only part of the equation. AMD is committed to meeting the industry’s energy efficiency challenge through its 25x30 goal — aiming to deliver 25 times more energy efficiency in AI and HPC by 2030. To achieve this, AMD is driving innovations in cooling systems, liquid cooling interoperability, and energy-efficient rack-scale designs that support the focus of OCP on operational sustainability.
Together, these efforts showcase how the contributions of AMD to OCP are enabling more scalable, secure, and sustainable compute — and why open standards are the foundation for the AI era.
AMD Showcases “Helios” Rack-Scale Platform

Representing a major step forward in open, interoperable AI infrastructure, AMD is publicly showcasing a static display of its “Helios” rack-scale platform for the first time. Developed in alignment with the new Open Rack Wide (ORW) specification introduced by Meta, “Helios” extends the AMD open hardware philosophy from silicon to system to rack.
The AMD “Helios” rack-scale platform integrates AMD Instinct™ GPUs, EPYC™ CPUs, and open fabrics to offer the industry a flexible, high-performance platform that extends AMD leadership in AI and high-performance computing and provides the foundation for the open, scalable infrastructure that will power the world’s growing AI demands. Check out the AMD “Helios” blog for more information.
Expanding Networking Choice: Upcoming AI NIC Innovations
Continuing the past year of progress on OCP initiatives, the AMD Pensando™ Pollara 400 AI NIC is now available in an OCP-compliant form factor, built for open, performant, and programmable AI networking.
It is the only Ethernet-based NIC designed specifically for AI workloads, delivering up to 20% higher performance than competing solutions1, up to a 25% boost in efficiency with next-generation UEC-supported capabilities2, up to 50% lower networking costs with multi-plane deployment3, and up to a 50% improvement in network reliability, availability, and serviceability4, leading to accelerated job completion times5.
Intelligent features minimize GPU cycles and optimize data movement to help prevent bottlenecks and reliability issues in large, scale-out data center and AI infrastructure deployments, giving customers a proven, production-ready solution for high-performance, open networking.
Building Toward an Industry Photonics Standard
Looking ahead, AMD is working with ecosystem partners to define an open photonic interface standard — avoiding proprietary lock-in and enabling high-bandwidth, low-latency interconnects for AI and HPC infrastructure. This initiative will help ensure that future data center architectures can scale efficiently while maintaining vendor choice.
Together, We Advance Open Computing
AMD is dedicated to execution excellence in silicon, systems, and software — and to building an open, collaborative ecosystem that accelerates innovation and market impact. From open hardware standards to collaborative software stacks, AMD is showcasing that open ecosystems thrive and the future of data center and AI infrastructure will be built together.
Visit AMD at the OCP Global Summit booth #B31 to explore our latest open solutions and learn how we can advance the future of AI infrastructure — together.
Footnotes
1. Testing was conducted by AMD Performance Labs as of 3/30/2025 on the Pollara 400 vs. the Broadcom Thor2 on test systems using identical GPU cluster configurations (16 SuperMicro server nodes, 128 AMD Instinct™ MI300 GPUs, 128 Pollara or Thor2 NICs, rail-based network topology using 64-port x 400G Broadcom Tomahawk5-based Ethernet switching). CPU on all nodes: 2P Intel® Xeon® Platinum 8468. Memory: 2048GB DDR5-4800 (64GB dual-rank DIMMs); OS: Ubuntu® 22.04, kernel 6.5.0-45-generic LTS; 2 rear M.2 NVMe drives; BIOS version 2.3.5; ROCm™ 6.3.0-39. Results may vary due to factors including but not limited to software versions, network speeds, and system configurations. (PEN-013)
2. Testing conducted by AMD Performance Labs as of April 28, 2025 on the AMD Pensando™ Pollara 400 AI NIC, on a production system comprising: 2 nodes of 8x AMD Instinct MI300X GPUs (16 GPUs); Broadcom Tomahawk-4-based leaf switch (64x400G) from MICAS network; CLOS topology; 16 AMD Pensando Pollara AI NICs; CPU model in each of the 2 nodes: dual-socket 5th Gen Intel® Xeon® 8568 48-core CPU with PCIe® Gen 5; BIOS version 1.3.6; mitigations off (default); system profile setting: Performance (default); SMT enabled (default); operating system Ubuntu 22.04.5 LTS, kernel 5.15.0-139-generic.
The following operation was measured: All-Reduce. Average 25% improvement for All-Reduce operations with 4 QPs using UEC-ready RDMA vs. RoCEv2 across multiple message-size samples (512MB, 1GB, 2GB, 4GB, 8GB, 16GB). Results are based on the average of at least 8 test runs. (PEN-016)
3. AMD comparison and pricing as of July 6, 2025, for network fabric costs to support 128,000 GPUs. Comparison of a Pollara NIC with multiplane fabric and packet spray on an 800G Tomahawk 5-based multiplane design versus a generic fat-tree fabric built on fully scheduled, big-buffer (Jericho3/Ramon3) 800G switching platforms. The generic system is assumed to use a competitive NIC, with NIC costs considered comparable. The Pollara-based design is estimated to deliver up to 58% network switching cost savings by enabling the use of more cost-effective Tomahawk 5-based switching in a multiplane architecture. AMD comparison and pricing as of 4/23/2025 of a Tomahawk 5 system with the Pensando Pollara NIC featuring exclusive multiplane fabric and packet spray versus a generic big-buffer 800G switching platform; the generic system would employ a competitive NIC, and NIC costs are assumed to be comparable. Deploying Pollara with multi-fabric support and packet spray allows customers to build cost-effective multiplane network fabrics instead of a fat-tree design, using fewer network switches to deliver the same amount of network bandwidth across the fabric and dramatically reducing both switch platform cost and the cost associated with cables and optics.
- Fat-tree big-buffer fully scheduled network (leaf/spine/core) estimated cost: $1.22B
  - 3,556 leaf (Jericho3-AI) units at $104,998 each = $373M
  - 1,557 spine/core (Ramon3) units at $147,998 each = $247M
  - 128K AOC-10m cables at $1,059 each = $136M
  - 568,889 QDD-SR4-400G transceivers at $819 each = $466M
  - Total (switching & optics) = $1.22B
- Naddod Tomahawk5 800G multiplane fabric network estimated cost: $511M
  - 3,000 leaf and spine units (Naddod N9600-640C) at $26,999 each = $81M
  - 384K QDD-SR4-400G transceivers at $819 each = $313M
  - 64K switch (OSFP-2x400G-DR4) transceivers for NIC connections at $759 each = $48M
  - 256K MPO cables at $26 each = $6.6M
  - 2K optical shuffle boxes, modules, and internal cables at $30K per rack = $60M
  - Total (switching & optics) = $511M
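As a quick arithmetic cross-check (a sketch using only the published line-item totals above, in $M), the snippet below reproduces the two fabric totals and derives the savings percentage, which lands near the up-to-58% switching cost savings cited above:

```python
# Sum the published line items (in $M) for each network build and derive
# the switching & optics cost savings cited in the footnote above.
fat_tree = {
    "Jericho3-AI leaf units": 373,
    "Ramon3 spine/core units": 247,
    "AOC-10m cables": 136,
    "QDD-SR4-400G transceivers": 466,
}
multiplane = {
    "Naddod N9600-640C leaf/spine units": 81,
    "QDD-SR4-400G transceivers": 313,
    "OSFP-2x400G-DR4 transceivers": 48,
    "MPO cables": 6.6,
    "Optical shuffle boxes": 60,
}

fat_tree_total = sum(fat_tree.values())      # 1222, i.e. ~$1.22B
multiplane_total = sum(multiplane.values())  # ~508.6, published as ~$511M
savings = 1 - multiplane_total / fat_tree_total

print(f"Fat-tree total:   ${fat_tree_total:,.0f}M")
print(f"Multiplane total: ${multiplane_total:,.1f}M")
print(f"Estimated switching & optics savings: {savings:.0%}")  # ~58%
```

The summed multiplane line items come to roughly $509M versus the published $511M total, presumably due to rounding in the individual entries.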
Prices subject to change. Comparison is for specific network configurations only and may not be representative of all possible network configurations and comparisons. (PEN-018)
4. Testing conducted by AMD Performance Labs as of September 15, 2025 on the AMD Pensando Pollara AI NIC, on a test system comprising an SMC-300X server for GPU-GPU communication: 2x AMD Pensando Pollara AI NICs, 2P AMD EPYC 9454 48-core processors, 8x AMD Instinct MI300X GPUs, Ubuntu 22.04.5 LTS, kernel 5.15.0-139-generic, ROCm 6.4.1.0-83-69b59e5. Testing ran Llama-3.1-8B; model configuration: SEQ_LEN=2048, TP=1, PP=1, CP=1, FP8=1, MBS=10, GBS=5120; iterations = 2; number of paths/QPs = 128. Results may vary based on factors including but not limited to system configuration and software settings. (PEN-019)
5. Testing conducted by AMD Performance Labs as of September 15, 2025 on the AMD Pensando Pollara AI NIC running Llama 3.1-405B at 64 global batch size (GBS) with 8K sequence length, on a test system comprising an 8-node SMC-300X cluster for GPU-GPU communication using 2x AMD Pensando Pollara AI NICs or 2x Nvidia CX-7 NICs, 2P AMD EPYC 9454 48-core processors, 8x AMD Instinct MI300X GPUs, Ubuntu 22.04.5 LTS, kernel 5.15.0-139-generic, ROCm 6.4.1.0-83-69b59e5. The measured operations are part of the gateway function. Configuration: num layers = 4, data type = BF16; DCN: TP=1, PP=1, SP=1, DP=1, FSDP=-1; ICI: TP=1, PP=1, SP=1, DP=1, FSDP=8. AINIC container: jax-private:rocm6.4.0-jax0.5.0-py3.10.12-tedev2.1-20250801_training. Results may vary based on factors including but not limited to system configuration and software settings. (PEN-020)
AMD, the AMD Arrow logo, and combinations thereof are trademarks of Advanced Micro Devices, Inc. Other product names used in this publication are for identification purposes only and may be trademarks of their respective owners.
© 2025 Advanced Micro Devices, Inc. All rights reserved.