Setting New Standards
Since their inception, AMD Instinct™ accelerators have delivered performance, efficiency, and scalability to data center customers and organizations embracing the possibilities of AI. Each generation has set new standards, delivered industry-leading specifications, and helped optimize performance and reduce TCO.1
Now, with the launch of AMD Instinct™ MI350 Series GPUs at the recent Advancing AI event, AMD is raising expectations once again.
AI Driven, HPC Optimized, Leadership Performance
Now’s the time to introduce your customers to the new AMD Instinct™ MI350X and AMD Instinct™ MI355X GPUs and their respective platforms. Each is built on the cutting-edge 4th Gen AMD CDNA™ architecture and offers up to 288GB of HBM3E memory capacity and 8TB/s of memory bandwidth. Designed for everything from massive AI model training and high-speed inference to complex HPC workloads, AMD Instinct MI350X GPUs deliver up to 2.05X the FP6 performance of the NVIDIA B200 platform,2 while AMD Instinct MI355X GPUs boast a 2X FP6 advantage over the GB200,3 setting a new bar for density, efficiency, and throughput at scale.
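To put those headline figures in perspective, here is a rough, illustrative sketch (plain Python, not a benchmark; the 20% memory reserve and the 70B-parameter example model are our own assumptions) of what 288GB of HBM3E and 8TB/s of bandwidth imply for memory-bound LLM inference:

```python
# Back-of-the-envelope sizing for a single MI350 Series GPU. Illustrative only:
# real throughput depends on kernels, batching, KV cache, and interconnect.

HBM_CAPACITY_GB = 288      # HBM3E capacity per GPU (see the spec table below)
HBM_BANDWIDTH_TBS = 8.0    # peak memory bandwidth per GPU, TB/s

def max_params_billions(bits_per_param: float, reserve: float = 0.2) -> float:
    """Largest model (billions of parameters) whose weights fit in HBM,
    reserving a fraction of memory for activations and KV cache."""
    usable_bytes = HBM_CAPACITY_GB * 1e9 * (1 - reserve)
    return usable_bytes / (bits_per_param / 8) / 1e9

def decode_tokens_per_sec_ceiling(params_b: float, bits_per_param: float) -> float:
    """Bandwidth-bound ceiling for single-stream decoding: each generated
    token must stream all model weights from HBM once."""
    weight_bytes = params_b * 1e9 * (bits_per_param / 8)
    return HBM_BANDWIDTH_TBS * 1e12 / weight_bytes

for name, bits in [("FP16", 16), ("FP8", 8), ("FP6", 6), ("FP4", 4)]:
    print(f"{name}: ~{max_params_billions(bits):.0f}B params fit; "
          f"~{decode_tokens_per_sec_ceiling(70, bits):.0f} tok/s ceiling at 70B")
```

The takeaway: capacity determines how large a model fits on a single GPU, while bandwidth sets the ceiling for single-stream token generation, and halving the bits per parameter roughly doubles both.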
As customer and infrastructure demands rise, AMD has designed these accelerators to keep pace. Available in air-cooled (MI350X) and direct liquid-cooled (MI355X) configurations, the new GPUs integrate seamlessly with previous-generation AMD Instinct™ MI300X and MI325X infrastructure, making them a pain-free, cost-effective upgrade for virtually any scenario where higher-density computing is a necessity.
Both AMD Instinct GPUs add support for the FP6 and FP4 datatypes alongside enhanced FP16 and FP8 processing, delivering uncompromising computational throughput and memory bandwidth utilization while maximizing energy efficiency. AMD Instinct MI350 Series GPUs also offer up to 4X better performance running FP4 than AMD Instinct MI300X GPUs running FP8,4 positioning them to deliver incredible performance on advanced generative AI models and push the boundaries of the space further than ever.
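As a concrete illustration of how those datatypes trade memory for precision, here is a minimal PyTorch sketch. It assumes a recent PyTorch build (2.1 or later, for the float8 storage dtypes); note that hardware-accelerated FP8/FP6/FP4 matmuls are typically reached through quantization libraries and framework kernels rather than the plain @ operator:

```python
import torch

# ROCm builds of PyTorch reuse the torch.cuda namespace, so the same code
# runs unchanged on AMD Instinct GPUs (falls back to CPU for illustration).
device = "cuda" if torch.cuda.is_available() else "cpu"

x = torch.randn(1, 4096, device=device)
w = torch.randn(4096, 4096, device=device)

# FP16: cast activations and weights for half-precision compute.
y = x.half() @ w.half()

# FP8: recent PyTorch versions expose float8 storage dtypes; accelerated
# low-precision matmuls usually come from quantization frameworks.
w_fp8 = w.to(torch.float8_e4m3fn)

mib = lambda t: t.element_size() * t.nelement() / 2**20
print(f"FP16 weight footprint: {mib(w.half()):.0f} MiB")  # ~32 MiB
print(f"FP8  weight footprint: {mib(w_fp8):.0f} MiB")     # ~16 MiB
```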
| Specification | AMD Instinct™ MI350X GPU | AMD Instinct™ MI350X Platform | AMD Instinct™ MI355X GPU | AMD Instinct™ MI355X Platform |
|---|---|---|---|---|
| GPUs | AMD Instinct MI350X OAM | 8 x AMD Instinct MI350X OAM | AMD Instinct MI355X OAM | 8 x AMD Instinct MI355X OAM |
| GPU Architecture | AMD CDNA™ 4 | AMD CDNA™ 4 | AMD CDNA™ 4 | AMD CDNA™ 4 |
| Dedicated Memory Size | 288 GB HBM3E | 2.3 TB HBM3E | 288 GB HBM3E | 2.3 TB HBM3E |
| Memory Bandwidth | 8 TB/s | 8 TB/s per OAM | 8 TB/s | 8 TB/s per OAM |
| Peak Half Precision (FP16) Performance* | 4.6 PFLOPS | 36.8 PFLOPS | 4.6 PFLOPS | 36.8 PFLOPS |
| Peak Eight-bit Precision (FP8) Performance* | 9.228 PFLOPS | 73.8 PFLOPS | 9.228 PFLOPS | 73.8 PFLOPS |
| Peak Six-bit Precision (FP6) Performance* | 18.45 PFLOPS | 148 PFLOPS | 18.45 PFLOPS | 148 PFLOPS |
| Peak Four-bit Precision (FP4) Performance* | 18.45 PFLOPS | 148 PFLOPS | 18.45 PFLOPS | 148 PFLOPS |
| Cooling | Air Cooled | Air Cooled | Direct Liquid Cooled | Direct Liquid Cooled |
| Typical Board Power | 1000W Peak | 1000W Peak per OAM | 1400W Peak | 1400W Peak per OAM |
*with structured sparsity
Integrated With Next-Generation AMD ROCm™ Software
Building on the AMD commitment to open-source innovation, AMD Instinct MI350 Series GPUs are integrated with the next-generation AMD ROCm™ software stack, the industry’s premier open alternative for AI and HPC workloads.
Launching alongside the new accelerators, the latest AMD ROCm software enhancements take AI workloads to the next level, further optimizing AI inference, training, and framework compatibility and delivering high-throughput, low-latency results for demanding workloads such as natural language processing (NLP) and computer vision.
ROCm software delivers Day-0 support for AI platforms and models from leaders such as OpenAI, Meta, PyTorch, Hugging Face, xAI, DeepSeek, and more, thanks to deep, strategic collaborations with key partners. Together, these efforts ensure AMD Instinct GPUs are optimized to run the latest AI models and frameworks as they launch, enabling both developers and businesses to accelerate how they integrate AI into their workflows.
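To see what that framework compatibility looks like in practice, here is a minimal sketch: a ROCm build of PyTorch reports the HIP runtime and drives AMD GPUs through the familiar torch.cuda API, so standard Hugging Face code runs unchanged (the model name below is an arbitrary small example chosen for illustration, not an AMD-endorsed pick):

```python
import torch
from transformers import pipeline  # Hugging Face Transformers

# On a ROCm build of PyTorch, torch.version.hip is set and the usual
# torch.cuda API drives the AMD GPU; no vendor-specific branches needed.
print("HIP runtime:", getattr(torch.version, "hip", None))
print("GPU available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))

# Any Hugging Face model runs through the same code path.
generator = pipeline(
    "text-generation",
    model="facebook/opt-125m",  # small example model, swap in your own
    device=0 if torch.cuda.is_available() else -1,
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
)
print(generator("AMD Instinct accelerators are", max_new_tokens=20)[0]["generated_text"])
```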
There’s a reason industry titans like Microsoft and Meta trust AMD Instinct GPUs to power large-scale AI deployments for models like Llama 405B and GPT. Speak to your AMD representative or visit amd.com to learn more and help enable your customers with the power of AMD Instinct accelerators.
AMD Arena
Enhance your AMD product knowledge with training on AMD Ryzen™ PRO, AMD EPYC™, AMD Instinct™, and more.
Subscribe
Get monthly updates on AMD’s latest products, training resources, and Meet the Experts webinars.

Footnotes
1. MI325-001A - Calculations conducted by AMD Performance Labs as of September 26, 2024, based on current specifications and/or estimation. The AMD Instinct™ MI325X OAM accelerator will have 256GB HBM3E memory capacity and 6 TB/s GPU peak theoretical memory bandwidth performance. Actual results based on production silicon may vary. The highest published results on the NVIDIA Hopper H200 (141GB) SXM GPU accelerator: 141GB HBM3E memory capacity and 4.8 TB/s GPU memory bandwidth (https://nvdam.widen.net/s/nb5zzzsjdf/hpc-datasheet-sc23-h200-datasheet-3002446). The highest published results on the NVIDIA Blackwell HGX B100 (192GB) 700W GPU accelerator: 192GB HBM3E memory capacity and 8 TB/s GPU memory bandwidth. The highest published results on the NVIDIA Blackwell HGX B200 (192GB) GPU accelerator: 192GB HBM3E memory capacity and 8 TB/s GPU memory bandwidth. NVIDIA Blackwell specifications at https://resources.nvidia.com/en-us-blackwell-architecture.
2. MI350-010 - Based on calculations by AMD Performance Labs in May 2025 for the 8-GPU AMD Instinct™ MI350X / MI355X Platforms to determine the peak theoretical precision performance when comparing FP64, FP32, TF32, FP16, FP8, FP6, FP4, and INT8 datatypes with Matrix, Tensor, Vector, and Sparsity, as applicable, vs. the NVIDIA HGX Blackwell B200 accelerator platform. Results may vary based on configuration, datatype, and workload.
3. MI350-018 - Based on calculations by AMD Performance Labs in May 2025 for the 8-GPU AMD Instinct™ MI355X Platform to determine the peak theoretical precision performance when comparing FP64, FP32, TF32, FP16, FP8, FP6, FP4, and INT8 datatypes with Matrix, Tensor, Vector, and Sparsity, as applicable, vs. the NVIDIA Grace Blackwell GB200 NVL72 8-GPU platform. Server manufacturers may vary configurations, yielding different results. Results may vary based on the use of the latest drivers and optimizations.
4. MI350-004 - Based on calculations by AMD Performance Labs in May 2025 to determine the peak theoretical precision performance of eight (8) AMD Instinct™ MI355X and MI350X GPUs (Platform) and eight (8) AMD Instinct MI325X, MI300X, MI250X, and MI100 GPUs (Platform) using the FP16, FP8, FP6, and FP4 datatypes with Matrix. Server manufacturers may vary configurations, yielding different results. Results may vary based on use of the latest drivers and optimizations.