AMD ROCm 7.0: Built for Developers, Ready for Enterprises

Sep 16, 2025

AI Is Moving Fast – And So Are the Challenges

AI innovation is accelerating at an unprecedented pace. Models are scaling into the hundreds of billions of parameters, inference demands are growing, and enterprises need scalable, efficient solutions that balance cost and performance. Developers face increasing pressure to keep up with these demands while ensuring flexibility, portability, and future readiness.

ROCm 7.0: A Big Step Forward in the AI Era

AMD ROCm™ 7.0 software is more than an update – it's a milestone release that addresses these challenges head-on. Built with a relentless focus on usability, performance, and support for the latest algorithms, ROCm 7.0 empowers both developers and enterprises to move faster, scale smarter, and deploy AI with confidence.

With ROCm 7.0, AMD delivers:

  • Breakthrough training and inference performance with the AMD Instinct™ MI350 series GPUs
  • Seamless distributed inference across clusters with support for leading frameworks
  • Enhanced code portability with HIP 7.0, streamlining development and migration across hardware ecosystems
  • New enterprise-focused tools to simplify AI infrastructure management and deployment
  • Popular large-scale MXFP4 and FP8 models quantized with AMD Quark

AMD Instinct MI350 Series GPUs: Breakthrough Performance for Next-Generation AI

AI training and inference workloads are growing more complex and demand more from every GPU. At the core of ROCm 7.0 is full enablement of the AMD Instinct™ MI350 series GPUs – powered by the AMD CDNA™ 4 architecture. With high-bandwidth memory and refined compute engines, ROCm 7.0 on these GPUs delivers substantial advancements for AI developers and practitioners.

To make these gains immediately accessible, AMD also provides prebuilt ROCm 7.0 vLLM and SGLang Docker images for AMD Instinct MI355, MI350, MI325, and MI300 GPUs. These images are optimized for MXFP4* and FP8 performance. Alongside them, we are releasing production-ready, large-scale MXFP4 and FP8 models – including DeepSeek R1, Llama 3.3 70B, Llama 3.1 405B, and more – optimized for seamless deployment on the vLLM and SGLang frameworks. These models are quantized with AMD Quark, our open-source model optimization toolkit, which brings together state-of-the-art optimization techniques such as quantization and pruning. Together with models natively available in MXFP4, such as gpt-oss-120b and gpt-oss-20b, these Docker images and model releases enable developers to quickly run and benchmark popular large-scale models.
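
As a concrete starting point, a minimal sketch of loading one of these quantized releases through vLLM's offline Python API might look like the following. The model ID, GPU count, and explicit quantization flag are illustrative assumptions – substitute the model card and parallelism that match your deployment.

  # Minimal sketch: offline inference on a Quark-quantized FP8 checkpoint with vLLM.
  # The model ID and tensor_parallel_size below are assumptions, not shipped defaults.
  from vllm import LLM, SamplingParams

  llm = LLM(
      model="amd/Llama-3.1-405B-Instruct-FP8-KV",  # illustrative model ID
      tensor_parallel_size=8,                      # shard across 8 Instinct GPUs
      quantization="fp8",                          # pre-quantized FP8 weights
  )

  prompts = ["Summarize the benefits of low-precision inference in one sentence."]
  params = SamplingParams(temperature=0.7, max_tokens=128)

  for output in llm.generate(prompts, params):
      print(output.outputs[0].text)

In practice, vLLM can typically infer the quantization scheme from the checkpoint's config, so the explicit flag may be redundant for pre-quantized models.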

These improvements directly translate into faster time-to-results, lower infrastructure costs, and the ability to experiment, fine-tune, and deploy state-of-the-art models more efficiently.

Distributed Inference: Scaling Beyond a Single Node

AI today is not just about single-node performance – it's about scaling across clusters and serving models at massive throughput. ROCm 7.0 makes distributed inference seamless with open-source integration, enabling enterprises to run state-of-the-art AI models at scale across multiple GPUs using popular frameworks. 

With expanded attention and reasoning algorithms, support for Mixture of Experts (MoE) models, and low-precision formats like FP4, FP6, and FP8, ROCm 7 ensures that enterprises and developers can run large-scale inference workloads more efficiently than ever.

Frameworks like SGLang take full advantage of AMD ROCm’s distributed capabilities, unlocking significant speedups by leveraging Prefill–Decode disaggregation – delivering higher throughput and lower latency compared to single-node inference. More framework support, deeper ecosystem integrations, and advanced distributed inference features are on the way, making AMD ROCm the foundation for next-generation AI deployment at scale.
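
For a sense of the developer workflow, here is a minimal sketch of multi-GPU inference with SGLang's offline Python engine. The model path and tp_size are assumptions for an 8-GPU node; Prefill–Decode disaggregation itself is configured when launching the SGLang server, and its options vary by release, so this sketch shows only the tensor-parallel building block.

  # Minimal sketch: tensor-parallel inference with SGLang's offline engine.
  # model_path and tp_size are assumptions for an 8-GPU Instinct node.
  import sglang as sgl

  engine = sgl.Engine(
      model_path="meta-llama/Llama-3.3-70B-Instruct",  # illustrative model path
      tp_size=8,                                       # tensor parallelism across 8 GPUs
  )

  result = engine.generate(
      "Explain prefill-decode disaggregation in two sentences.",
      {"temperature": 0.7, "max_new_tokens": 128},
  )
  print(result["text"])

  engine.shutdown()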

Enterprise AI: Open and Scalable

With ROCm 7.0, AMD is releasing new tools to help enterprise customers address the growing need for AI infrastructure management. This release delivers two key components:

  • AMD Resource Manager – simplifying cluster-scale orchestration and optimizing AI workloads across Kubernetes, Slurm, and enterprise environments. 
  • AMD AI Workbench – a flexible environment for deploying, adapting, and scaling AI models, with built-in support for inference, fine-tuning, and integration into enterprise workflows.

Sign up for early access to explore these AMD Enterprise AI tools.

By embracing open-source principles, AMD ensures transparency, flexibility, and ecosystem collaboration – helping enterprises build intelligent, autonomous systems that deliver real-world impact.

Get Started Today

ROCm 7 makes high-performance AI more accessible than ever. Explore the ROCm AI developer hub for tutorials, guides, and other tools to accelerate your work. Use the prebuilt Docker images for SGLang, vLLM, Megatron-LM, and JAX to benchmark performance on AMD Instinct GPUs, and dive into the ROCm documentation for in-depth best practices and deployment guidance.

Whether you are scaling enterprise AI or experimenting with the latest models, ROCm 7.0 is ready – start building today. 


Footnotes

*MI350 series only