AMD at OCP 2025: Driving Open Ecosystems and Scalable Compute for the AI Era
Oct 14, 2025

At the OCP Global Summit 2025, AMD is showcasing how open collaboration and open standards are reshaping the future of data center and AI infrastructure. As AI workloads grow from single racks to entire regions, the need for interoperability, scalability, and sustainability has never been greater.
The efforts by AMD, working across the broad OCP community over the past year, reflect a clear vision: open ecosystems thrive. By contributing to open hardware, software, and networking standards, AMD is helping the industry reduce friction, accelerate technology adoption, and enable choice and flexibility at scale.
Building the Foundation: A Year of Collaborative Progress
Together with hyperscalers, ODMs, and fellow OCP members, AMD has advanced multiple OCP initiatives that are already impacting how data centers are designed and deployed.
One major area of progress is platform enablement. As an early participant in the DC-MHS initiative, AMD helped pioneer a modular approach to server design that makes it easier for the industry to adopt new technologies. AMD OCP-compliant server reference designs — supporting today’s AMD EPYC™ 9005 processors and future product generations — integrate standards such as DC-MHS, M-CRPS, and DC-SCM, along with OCP-compliant NICs, to support interoperability, vendor neutrality, and scalability in modern data centers while enabling faster ecosystem readiness and broader industry choice.
Security has been another critical focus. AMD co-founded Project Caliptra, an open-source hardware root of trust designed to help protect confidential computing workloads. Alongside adoption of OCP S.A.F.E. and leadership in the OCP Attestation Specification, AMD is helping devices prove their trustworthiness through cryptographic evidence — a key requirement for more secure, large-scale deployments.
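To illustrate the general evidence-based attestation pattern at a high level, here is a hedged sketch (not the actual Caliptra implementation or the OCP Attestation Specification flow; the key handling and evidence format are simplified assumptions) in which a verifier checks that device-signed measurements are fresh and match a trusted reference:

```python
# Hedged sketch of evidence-based attestation (illustrative only; not the
# Caliptra implementation or the OCP Attestation Specification wire format).
import os
import hashlib

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Device side: a hardware root of trust measures firmware at boot and holds
# a device identity key (generated here only for the sake of the example).
device_key = Ed25519PrivateKey.generate()
firmware_measurement = hashlib.sha384(b"firmware image contents").digest()

def produce_evidence(nonce: bytes) -> tuple[bytes, bytes]:
    """Sign the measurement together with a verifier-supplied nonce so the
    evidence is fresh and cannot be replayed."""
    payload = firmware_measurement + nonce
    return payload, device_key.sign(payload)

# Verifier side: knows the device public key and a trusted reference
# measurement (e.g., from a signed manifest).
device_pub = device_key.public_key()
reference_measurement = firmware_measurement

nonce = os.urandom(32)
payload, signature = produce_evidence(nonce)

try:
    device_pub.verify(signature, payload)  # raises InvalidSignature on failure
    if payload != reference_measurement + nonce:
        raise ValueError("measurement does not match the trusted reference")
    print("device attested: evidence is fresh, signed, and matches reference")
except (InvalidSignature, ValueError) as err:
    print(f"attestation failed: {err}")
```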
Networking standards have also gained momentum. AMD is driving UALink and Ultra Ethernet to deliver open, multi-vendor scale-up and scale-out fabrics, enabling any CPU, any accelerator, at any scale. This work complements the broader AMD commitment to open interconnects that avoid proprietary vendor lock-in.
Beyond these efforts, AMD has:
- Formed the x86 Ecosystem Advisory Group (EAG), bringing together AMD, Intel and key ecosystem partners in a shared commitment to advancing the future of the x86 platform through collaborative decision-making, standardized features, and developer-friendly innovations. Check out the AMD blog marking the one-year anniversary of the x86 EAG and highlighting the notable progress and technology milestones achieved over the past year.
- Expanded its open software ecosystem through ROCm™, open-source libraries, and contributions to the Linux Foundation, PyTorch Foundation, and other open governance bodies — giving developers freedom to build without lock-in (see the short portability sketch after this list).
- Led advancements in RAS (Reliability, Availability, and Serviceability) standards for OCP CPUs and GPUs, addressing large-scale data center pain points in hardware fault management, memory error handling, and debug capabilities.
- Contributed to management and cooling initiatives, from the Open Boot & Management Framework to liquid cooling interoperability projects, supporting sustainable, high-performance infrastructure.
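The software bullet above notes that the open software ecosystem gives developers freedom to build without lock-in. As a small, hedged illustration (not an official AMD or ROCm example), the snippet below runs unchanged on ROCm and CUDA builds of PyTorch, since ROCm exposes the same torch.cuda device interface:

```python
# Minimal sketch: the same PyTorch code runs unmodified on ROCm or CUDA
# builds, because ROCm exposes the familiar torch.cuda device interface.
import torch

# On ROCm wheels, torch.version.hip is a version string; on CUDA wheels
# it is None, so this distinguishes the two builds.
backend = "ROCm" if getattr(torch.version, "hip", None) else "CUDA"
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Build: {backend}, running on: {device}")

# Device-agnostic math: nothing here is vendor-specific.
x = torch.randn(1024, 1024, device=device)
y = x @ x.T            # matrix multiply on whichever accelerator is present
print(y.norm().item())
```

Because ROCm implements the same device semantics, frameworks and downstream libraries can target AMD hardware without vendor-specific code paths.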
Performance and scalability are only part of the equation. AMD is committed to meeting the industry’s energy efficiency challenge through its 25x30 goal — aiming to deliver 25 times more energy efficiency in AI and HPC by 2030. To achieve this, AMD is driving innovations in cooling systems, liquid cooling interoperability, and energy-efficient rack-scale designs that support the focus of OCP on operational sustainability.
Together, these efforts showcase how the contributions of AMD to OCP are enabling more scalable, secure, and sustainable compute — and why open standards are the foundation for the AI era.
AMD Showcases “Helios” Rack-Scale Platform

Representing a major step forward in open, interoperable AI infrastructure, AMD is publicly showcasing a static display of its “Helios” rack-scale platform for the first time. Developed in alignment with the new Open Rack Wide (ORW) specification introduced by Meta, “Helios” extends the AMD open hardware philosophy from silicon to system to rack.
The AMD “Helios” rack-scale platform integrates AMD Instinct™ GPUs, EPYC™ CPUs, and open fabrics to offer the industry a flexible, high-performance platform that extends AMD leadership in AI and high-performance computing and provides the foundation for the open, scalable infrastructure that will power the world’s growing AI demands. Check out the AMD “Helios” blog for more information.
Expanding Networking Choice: Upcoming AI NIC Innovations
Continuing the past year of progress on OCP initiatives, the AMD Pensando™ Pollara 400 AI NIC is now available in an OCP-compliant form factor, built for open, performant, and programmable AI networking.
It is the only Ethernet-based NIC designed specifically for AI workloads, delivering up to 20% higher performance than competing solutions1, up to a 25% boost in efficiency with next-generation UEC-supported capabilities2, up to 50% lower networking costs with multi-plane deployment3, and up to a 50% improvement in network reliability, availability, and serviceability4, leading to accelerated job completion times5.
Intelligent features minimize GPU cycles and optimize data movement to help prevent bottlenecks and reliability issues in large, scale-out data center and AI infrastructure deployments, giving customers a proven, production-ready solution for high-performance, open networking.
Building Toward an Industry Photonics Standard
Looking ahead, AMD is working with ecosystem partners to define an open photonic interface standard — avoiding proprietary lock-in and enabling high-bandwidth, low-latency interconnects for AI and HPC infrastructure. This initiative will help ensure that future data center architectures can scale efficiently while maintaining vendor choice.
Together, We Advance Open Computing
AMD is dedicated to execution excellence in silicon, systems, and software — and to building an open, collaborative ecosystem that accelerates innovation and market impact. From open hardware standards to collaborative software stacks, AMD is showcasing that open ecosystems thrive and the future of data center and AI infrastructure will be built together.
Visit AMD at the OCP Global Summit booth #B31 to explore our latest open solutions and learn how we can advance the future of AI infrastructure — together.
Footnotes
1. Testing was conducted by AMD Performance Labs as of 3/30/2025 on the Pollara 400 vs. the Broadcom Thor2 on test systems using identical GPU cluster configurations (16 SuperMicro server nodes, 128 AMD Instinct™ MI300 GPUs, 128 Pollara or Thor2 NICs, rail-based network topology using 64-port x 400G Broadcom Tomahawk5-based Ethernet switching). CPU on all nodes: 2P Intel® Xeon® Platinum 8468. Memory: 2048GB DDR5-4800 (64GB dual-rank DIMMs); OS: Ubuntu® 22.04, kernel 6.5.0-45-generic LTS; 2 rear M.2 NVMe drives; BIOS version 2.3.5; ROCm™ 6.3.0-39. Results may vary due to factors including but not limited to software versions, network speeds, and system configurations. (PEN-013)
2. Testing conducted by AMD Performance Labs as of April 28, 2025 on the AMD Pensando™ Pollara 400 AI NIC, on a production system comprising: 2 nodes of 8x AMD Instinct MI300X GPUs (16 GPUs); Broadcom Tomahawk-4-based leaf switch (64x400G) from MICAS network; CLOS topology; 16 AMD Pensando Pollara AI NICs; CPU model in each of the 2 nodes: dual-socket 5th Gen Intel® Xeon® 8568 48-core CPU with PCIe® Gen 5; BIOS version 1.3.6; mitigations off (default); system profile setting: Performance (default); SMT enabled (default); operating system Ubuntu 22.04.5 LTS, kernel 5.15.0-139-generic.
The following operation was measured: All-Reduce. Average 25% improvement for All-Reduce operations with 4 QPs using UEC-ready RDMA vs. RoCEv2 across multiple message-size samples (512MB, 1GB, 2GB, 4GB, 8GB, 16GB). Results are based on the average of at least 8 test runs. (PEN-016)
3. AMD comparison and pricing as of July 6, 2025, for network fabric costs to support 128,000 GPUs. Comparison of a Pollara NIC with multiplane fabric and packet spray on an 800G Tomahawk 5-based multiplane design versus a generic fat-tree fabric built on fully scheduled, big-buffer (Jericho3/Ramon3) 800G switching platforms. The generic system is assumed to use a competitive NIC, with NIC costs considered comparable. The Pollara-based design is estimated to deliver up to 58% network switching cost savings by enabling the use of more cost-effective Tomahawk 5-based switching in a multiplane architecture. AMD comparison and pricing as of 4/23/2025 of a Tomahawk 5 system with the Pensando Pollara NIC featuring exclusive multiplane fabric and packet spray versus a generic big-buffer 800G switching platform; the generic system would employ a competitive NIC, and NIC costs are assumed to be comparable. Deploying Pollara with multi-fabric support and packet spray allows customers to build cost-effective multiplane network fabrics instead of a fat-tree design, using fewer network switches to deliver the same amount of network bandwidth across the fabric and dramatically reducing both switch platform cost and the cost associated with cables and optics.
- Fat-tree big-buffer fully scheduled network (leaf/spine/core) estimated cost: $1.22B
  - 3,556 leaf (Jericho3-AI) units at $104,998 each = $373M
  - 1,557 spine/core (Ramon3) units at $147,998 each = $247M
  - 128K AOC-10m cables at $1,059 each = $136M
  - 568,889 QDD-SR4-400G transceivers at $819 each = $466M
  - Total (switching & optics) = $1.22B
- Naddod Tomahawk5 800G multiplane fabric network estimated cost: $511M
  - 3,000 leaf and spine units (Naddod N9600-640C) at $26,999 each = $81M
  - 384K QDD-SR4-400G transceivers at $819 each = $313M
  - 64K switch (OSFP-2x400G-DR4) transceivers for NIC connections at $759 each = $48M
  - 256K MPO cables at $26 each = $6.6M
  - 2K optical shuffle boxes, modules, and internal cables at $30K per rack = $60M
  - Total (switching & optics) = $511M
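As a quick arithmetic cross-check (a sketch using only the published line-item totals above, in $M), the snippet below reproduces the two fabric totals and derives the savings percentage, which lands near the up-to-58% switching cost savings cited above:

```python
# Sum the published line items (in $M) for each network build and derive
# the switching & optics cost savings cited in the footnote above.
fat_tree = {
    "Jericho3-AI leaf units": 373,
    "Ramon3 spine/core units": 247,
    "AOC-10m cables": 136,
    "QDD-SR4-400G transceivers": 466,
}
multiplane = {
    "Naddod N9600-640C leaf/spine units": 81,
    "QDD-SR4-400G transceivers": 313,
    "OSFP-2x400G-DR4 transceivers": 48,
    "MPO cables": 6.6,
    "Optical shuffle boxes": 60,
}

fat_tree_total = sum(fat_tree.values())      # 1222, i.e. ~$1.22B
multiplane_total = sum(multiplane.values())  # ~508.6, published as ~$511M
savings = 1 - multiplane_total / fat_tree_total

print(f"Fat-tree total:   ${fat_tree_total:,.0f}M")
print(f"Multiplane total: ${multiplane_total:,.1f}M")
print(f"Estimated switching & optics savings: {savings:.0%}")  # ~58%
```

The summed multiplane line items come to roughly $509M versus the published $511M total, presumably due to rounding in the individual entries.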
Prices subject to change. Comparison is for specific network configurations only and may not be representative of all possible network configurations and comparisons. (PEN-018)
4. Testing conducted by AMD Performance Labs as of September 15, 2025 on the AMD Pensando Pollara AI NIC, on a test system comprising an SMC-300X server for GPU-GPU communication: 2x AMD Pensando Pollara AI NICs, 2P AMD EPYC 9454 48-core processors, 8x AMD Instinct MI300X GPUs, Ubuntu 22.04.5 LTS, kernel 5.15.0-139-generic, ROCm 6.4.1.0-83-69b59e5. Testing ran Llama-3.1-8B; model configuration: SEQ_LEN=2048, TP=1, PP=1, CP=1, FP8=1, MBS=10, GBS=5120; iterations = 2; number of paths/QPs = 128. Results may vary based on factors including but not limited to system configuration and software settings. (PEN-019)
5. Testing conducted by AMD Performance Labs as of September 15, 2025 on the AMD Pensando Pollara AI NIC running Llama 3.1-405B at 64 global batch size (GBS) with 8K sequence length, on a test system comprising an 8-node SMC-300X cluster for GPU-GPU communication using 2x AMD Pensando Pollara AI NICs or 2x Nvidia CX-7 NICs, 2P AMD EPYC 9454 48-core processors, 8x AMD Instinct MI300X GPUs, Ubuntu 22.04.5 LTS, kernel 5.15.0-139-generic, ROCm 6.4.1.0-83-69b59e5. The measured operations are part of the gateway function. Configuration: num layers = 4, data type = BF16; DCN: TP=1, PP=1, SP=1, DP=1, FSDP=-1; ICI: TP=1, PP=1, SP=1, DP=1, FSDP=8. AINIC container: jax-private:rocm6.4.0-jax0.5.0-py3.10.12-tedev2.1-20250801_training. Results may vary based on factors including but not limited to system configuration and software settings. (PEN-020)
AMD, the AMD Arrow logo, and combinations thereof are trademarks of Advanced Micro Devices, Inc. Other product names used in this publication are for identification purposes only and may be trademarks of their respective owners.
© 2025 Advanced Micro Devices, Inc. All rights reserved.