AMD CDNA 3

AMD CDNA™ 3 is the dedicated compute architecture underlying AMD Instinct™ MI300 Series accelerators. It features advanced packaging with chiplet technologies—designed to reduce data movement overhead and enhance power efficiency.

AMD Instinct MI300A Accelerated Processing Unit

AMD Instinct MI325X Accelerator

Matrix Core Technologies

AMD CDNA 3 includes Matrix Core Technologies that deliver enhanced computational throughput through improved instruction-level parallelism, with support for a broad range of precisions (INT8, FP8, BF16, FP16, TF32, FP32, and FP64) as well as sparse matrix data (sparsity).
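
For illustration only (this sketch is not from the original page), the HIP/hipBLAS program below runs a mixed-precision GEMM with FP16 inputs accumulated in FP32, the kind of operation Matrix Cores accelerate. It assumes the pre-2.0 hipblasGemmEx signature and datatype enums (HIPBLAS_R_16F, HIPBLAS_R_32F), which differ in newer hipBLAS releases; error checking and input initialization are omitted for brevity.

    #include <hip/hip_runtime.h>
    #include <hipblas/hipblas.h>

    int main() {
        const int n = 1024;                     // square matrices for simplicity
        hipblasHandle_t handle;
        hipblasCreate(&handle);

        // Device buffers: FP16 inputs A and B, FP32 output C
        // (left uninitialized in this sketch).
        hipblasHalf *dA, *dB;
        float *dC;
        hipMalloc(&dA, n * n * sizeof(hipblasHalf));
        hipMalloc(&dB, n * n * sizeof(hipblasHalf));
        hipMalloc(&dC, n * n * sizeof(float));

        const float alpha = 1.0f, beta = 0.0f;

        // FP16 x FP16 inputs accumulated in FP32; on CDNA hardware this
        // path is the kind of work the Matrix Core (MFMA) units accelerate.
        hipblasGemmEx(handle, HIPBLAS_OP_N, HIPBLAS_OP_N,
                      n, n, n, &alpha,
                      dA, HIPBLAS_R_16F, n,
                      dB, HIPBLAS_R_16F, n, &beta,
                      dC, HIPBLAS_R_32F, n,
                      HIPBLAS_R_32F,            // compute/accumulate type
                      HIPBLAS_GEMM_DEFAULT);

        hipDeviceSynchronize();
        hipFree(dA); hipFree(dB); hipFree(dC);
        hipblasDestroy(handle);
        return 0;
    }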

HBM Memory, Cache & Coherency

AMD Instinct MI300 Series accelerators offer industry-leading HBM3E capacity and memory bandwidth [1,2], as well as shared memory and AMD Infinity Cache™ (a shared last-level cache), eliminating data copies and improving latency.
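
As a hedged sketch of what eliminating data copies can look like from software, the HIP program below uses managed memory so a single allocation is touched by both CPU and GPU with no explicit hipMemcpy staging. hipMallocManaged is a standard HIP runtime call; treating it as a stand-in for MI300A's unified HBM, where CPU and GPU share the same physical memory, is this example's assumption.

    #include <hip/hip_runtime.h>
    #include <cstdio>

    __global__ void scale(float* x, int n, float s) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) x[i] *= s;                  // GPU updates the shared data in place
    }

    int main() {
        const int n = 1 << 20;
        float* x = nullptr;

        // One allocation visible to both CPU and GPU -- no hipMemcpy staging.
        hipMallocManaged(reinterpret_cast<void**>(&x), n * sizeof(float));

        for (int i = 0; i < n; ++i) x[i] = 1.0f;   // CPU writes directly

        scale<<<(n + 255) / 256, 256>>>(x, n, 2.0f);
        hipDeviceSynchronize();

        printf("x[0] = %f\n", x[0]);           // CPU reads the result, no copy-back
        hipFree(x);
        return 0;
    }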


Unified Fabric

Next-gen AMD Infinity Architecture, along with AMD Infinity Fabric™ technology, enables coherent, high-throughput unification of AMD GPU and CPU chiplet technologies with stacked HBM3 memory, both within single devices and across multi-device platforms. It also offers enhanced I/O with PCIe® Gen 5 compatibility.
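
To make the multi-device side concrete, here is a minimal HIP sketch (not from the original page) that probes and enables direct peer access between GPU pairs; on MI300-based platforms such peer traffic is carried over Infinity Fabric links. hipDeviceCanAccessPeer and hipDeviceEnablePeerAccess are standard HIP runtime calls; the platform topology itself is an assumption.

    #include <hip/hip_runtime.h>
    #include <cstdio>

    int main() {
        int count = 0;
        hipGetDeviceCount(&count);

        // Probe every device pair and enable direct peer access where supported.
        for (int a = 0; a < count; ++a) {
            for (int b = 0; b < count; ++b) {
                if (a == b) continue;
                int can = 0;
                hipDeviceCanAccessPeer(&can, a, b);
                if (can) {
                    hipSetDevice(a);                  // peer access is enabled per device
                    hipDeviceEnablePeerAccess(b, 0);  // flags must be zero
                    printf("GPU %d -> GPU %d: direct peer access enabled\n", a, b);
                }
            }
        }
        return 0;
    }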


AMD CDNA 2

AMD CDNA 2 architecture is designed to accelerate even the most taxing scientific computing workloads and machine learning applications. It underlies AMD Instinct MI200 Series accelerators.


AMD CDNA

AMD CDNA architecture is a dedicated architecture for GPU-based compute that was designed to usher in the era of Exascale-class computing. It underlies AMD Instinct MI100 Series accelerators.

AMD Instinct Accelerators

Discover how AMD Instinct accelerators supercharge AI and HPC.

AMD ROCm™ Software

AMD CDNA architecture is supported by AMD ROCm™, an open software stack that includes a broad set of programming models, tools, compilers, libraries, and runtimes for AI and HPC solution development targeting AMD Instinct accelerators. 
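
To show what targeting AMD Instinct with ROCm looks like in practice, here is a minimal, self-contained HIP program; it is a sketch rather than an official ROCm example, and the gfx942 offload target named in the comment is an assumption corresponding to MI300-series parts.

    // vector_add.hip -- built with, e.g.:
    //   hipcc --offload-arch=gfx942 vector_add.hip -o vector_add
    // (gfx942 targets MI300-series parts; adjust for other accelerators)
    #include <hip/hip_runtime.h>
    #include <cstdio>
    #include <vector>

    __global__ void vadd(const float* a, const float* b, float* c, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) c[i] = a[i] + b[i];
    }

    int main() {
        const int n = 1 << 16;
        std::vector<float> ha(n, 1.0f), hb(n, 2.0f), hc(n);
        float *da, *db, *dc;
        hipMalloc(&da, n * sizeof(float));
        hipMalloc(&db, n * sizeof(float));
        hipMalloc(&dc, n * sizeof(float));
        hipMemcpy(da, ha.data(), n * sizeof(float), hipMemcpyHostToDevice);
        hipMemcpy(db, hb.data(), n * sizeof(float), hipMemcpyHostToDevice);

        vadd<<<(n + 255) / 256, 256>>>(da, db, dc, n);

        hipMemcpy(hc.data(), dc, n * sizeof(float), hipMemcpyDeviceToHost);
        printf("hc[0] = %f\n", hc[0]);   // expect 3.0
        hipFree(da); hipFree(db); hipFree(dc);
        return 0;
    }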

Footnotes

©2023 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo, AMD Instinct, AMD CDNA, Infinity Fabric, ROCm, and combinations thereof are trademarks of Advanced Micro Devices, Inc. PCIe is a registered trademark of PCI-SIG Corporation. Other names are for informational purposes only and may be trademarks of their respective owners.

  1. Calculations conducted by AMD Performance Labs as of November 7, 2023, for the AMD Instinct™ MI300A APU accelerator 760W (128 GB HBM3) designed with AMD CDNA™ 3 5nm FinFET process technology resulted in 128 GB HBM3 memory capacity and 5.325 TB/s peak theoretical memory bandwidth performance. The MI300A memory bus interface is 8,192 bits (1,024 bits x 8 die) and the memory data rate is 5.2 Gbps, for a total peak memory bandwidth of 5.325 TB/s (8,192-bit memory bus interface * 5.2 Gbps memory data rate / 8). The highest published results on the NVIDIA Hopper H200 (141GB) SXM GPU accelerator resulted in 141GB HBM3E memory capacity and 4.8 TB/s GPU memory bandwidth performance (https://nvdam.widen.net/s/nb5zzzsjdf/hpc-datasheet-sc23-h200-datasheet-3002446). The highest published results on the NVIDIA Hopper H100 (80GB) SXM GPU accelerator resulted in 80GB HBM3 memory capacity and 3.35 TB/s GPU memory bandwidth performance (https://resources.nvidia.com/en-us-tensor-core/nvidia-tensor-core-gpu-datasheet). Server manufacturers may vary configuration offerings yielding different results. MI300-12
  2. MI325-001A - Calculations conducted by AMD Performance Labs as of September 26, 2024, based on current specifications and/or estimations. The AMD Instinct™ MI325X OAM accelerator will have 256GB HBM3E memory capacity and 6 TB/s GPU peak theoretical memory bandwidth performance. Actual results based on production silicon may vary.
  3. The highest published results on the NVIDIA Hopper H200 (141GB) SXM GPU accelerator resulted in 141GB HBM3E memory capacity and 4.8 TB/s GPU memory bandwidth performance. https://nvdam.widen.net/s/nb5zzzsjdf/hpc-datasheet-sc23-h200-datasheet-3002446
    The highest published results on the NVIDIA Blackwell HGX B100 (192GB) 700W GPU accelerator resulted in 192GB HBM3E memory capacity and 8 TB/s GPU memory bandwidth performance.
    The highest published results on the NVIDIA Blackwell HGX B200 (192GB) GPU accelerator resulted in 192GB HBM3E memory capacity and 8 TB/s GPU memory bandwidth performance.
    NVIDIA Blackwell specifications at https://resources.nvidia.com/en-us-blackwell-architecture