AMD Ryzen™ AI Software 1.7 Release
Jan 26, 2026
The Ryzen™ AI Software 1.7 release introduces several updates aimed at improving model coverage, reducing friction in local development workflows, and delivering more predictable performance on AMD Accelerated Processing Units (NPU + iGPU). This release adds new architecture support, expands context length for LLMs, integrates Stable Diffusion into the unified Ryzen AI installer, and improves BF16 inference latency.
New Architectures: GPT‑OSS (MoE) and Gemma‑3 4B VLM
RAI 1.7 adds support for the Mixture-of-Experts (MoE) GPT-OSS model and the Gemma 3 4B vision-language model (VLM), expanding the set of NPU-executable architectures available to developers.
- MoE efficiency: MoE models route tokens through a subset of expert networks, letting developers run larger, more capable models without paying the full compute cost of a dense architecture. This can translate into better throughput and more responsive local LLM pipelines.
- VLM capability: VLM support enables multimodal tasks such as image-grounded reasoning, captioning, and lightweight visual search, as well as multimodal agent components.
- Broader experimentation: Developers can now benchmark and compare dense, MoE, and VLM architectures under the same NPU constraints, making it easier to choose models for production; a minimal generation sketch follows this list.
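To make the loop concrete, here is a minimal token-streaming sketch using the onnxruntime-genai (OGA) Python API that Ryzen AI's LLM flow builds on. The model folder path is hypothetical and the available search options can vary by release, so treat this as a starting point under those assumptions rather than the official recipe.

```python
# Minimal token-streaming sketch with onnxruntime-genai (OGA).
# The model path below is a hypothetical local folder holding a
# Ryzen AI hybrid build of GPT-OSS; substitute your own.
import onnxruntime_genai as og

model = og.Model("./models/gpt-oss-hybrid")  # hypothetical path
tokenizer = og.Tokenizer(model)
stream = tokenizer.create_stream()

params = og.GeneratorParams(model)
params.set_search_options(max_length=512)  # total token budget

generator = og.Generator(model, params)
generator.append_tokens(tokenizer.encode("Explain MoE routing in two sentences."))

# Decode tokens as they arrive so responsiveness is directly visible.
while not generator.is_done():
    generator.generate_next_token()
    print(stream.decode(generator.get_next_tokens()[0]), end="", flush=True)
print()
```

The same decode loop applies to the Gemma 3 4B VLM once image inputs are preprocessed, though the multimodal entry points differ; see the 1.7 documentation for the exact VLM interface.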
Stable Diffusion Integrated Into the Main Ryzen AI Installer
Stable Diffusion is now built directly into the primary Ryzen AI installer instead of requiring a separate environment.
- Predictable environment setup: Developers no longer need to manage SD‑specific Python environments, dependencies, or build steps.
- Unified toolchain: LLM, VLM, and SD workflows now live in a common environment, simplifying development for those building mixed-modality applications.
- Faster iteration: Faster setup means developers can quickly prototype text-to-image, image-to-image, or hybrid workflows without wrestling with environment fragmentation; a stand-in text-to-image sketch follows this list.
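For illustration only, here is the general shape of a text-to-image loop. This sketch uses Hugging Face's optimum.onnxruntime pipeline as a stand-in, not the Ryzen AI Stable Diffusion flow itself (whose entry points are covered in the 1.7 documentation), and the checkpoint id is a placeholder.

```python
# Stand-in text-to-image sketch using Hugging Face optimum's ONNX
# Runtime pipeline. This is NOT the Ryzen AI SD API; it only shows
# the shape of the workflow. The checkpoint id is a placeholder.
from optimum.onnxruntime import ORTStableDiffusionPipeline

pipe = ORTStableDiffusionPipeline.from_pretrained(
    "your-sd-checkpoint-or-hub-id",  # placeholder
    export=True,                     # convert PyTorch weights to ONNX
)
image = pipe(
    "a watercolor sketch of a laptop on a desk",
    num_inference_steps=25,
).images[0]
image.save("out.png")
```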
LLM Support for Up to 16K Context Length on Hybrid
Most LLMs in RAI 1.7 now support up to 16K tokens of context when running in hybrid mode (iGPU + NPU).
- Long‑form reasoning: Developers can build applications involving longer documents, extended multi‑turn conversations, or workflows requiring persistent memory.
- Local RAG stacks: Longer context directly improves the effectiveness of on-device retrieval-augmented generation, reducing truncation and improving model grounding; a simple token-budgeting sketch follows this list.
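As a rough illustration of how you might exploit the larger window, the sketch below greedily packs retrieved chunks into a 16K-token budget using the OGA tokenizer. The model path is hypothetical, and how much headroom to reserve for output depends on your application.

```python
# Sketch: pack retrieved chunks into a 16K-token hybrid context window.
# Assumes an OGA model folder as in the earlier sketch (hypothetical path).
import onnxruntime_genai as og

CONTEXT_LIMIT = 16 * 1024    # 16K-token window in hybrid mode
RESERVED_FOR_OUTPUT = 1024   # headroom for the model's reply (assumption)

model = og.Model("./models/llm-hybrid")  # hypothetical path
tokenizer = og.Tokenizer(model)

def pack_prompt(question: str, chunks: list[str]) -> str:
    """Greedily append retrieved chunks until the token budget is spent."""
    prompt = question
    budget = CONTEXT_LIMIT - RESERVED_FOR_OUTPUT
    for chunk in chunks:
        candidate = prompt + "\n\n" + chunk
        if len(tokenizer.encode(candidate)) > budget:
            break
        prompt = candidate
    return prompt
```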
BF16 Pipeline With ~2x Lower Latency vs. RAI 1.6
The BF16 implementation in RAI 1.7 roughly halves inference latency relative to RAI 1.6, which for token generation translates into approximately double the effective throughput.
- Faster interactive LLMs: Lower token latency improves user-perceived responsiveness, especially for chat-style applications or agent loops.
- Better baseline for fine-tuned models: BF16 improvements benefit both pretrained and custom fine-tuned models, reducing time-to-first-token and overall inference duration; a simple benchmarking sketch follows this list.
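If you want to verify the improvement on your own models, here is a minimal sketch that measures time-to-first-token and decode throughput with the OGA API. The model path is hypothetical, and a real benchmark would add warm-up passes and average several runs.

```python
# Sketch: measure time-to-first-token (TTFT) and decode rate so BF16
# builds can be compared across RAI versions. Hypothetical model path;
# a real benchmark would add warm-up passes and average several runs.
import time
import onnxruntime_genai as og

model = og.Model("./models/llm-bf16")  # hypothetical path
tokenizer = og.Tokenizer(model)
params = og.GeneratorParams(model)
params.set_search_options(max_length=256)

generator = og.Generator(model, params)
generator.append_tokens(tokenizer.encode("Summarize BF16 in one sentence."))

start = time.perf_counter()
first_token_at, tokens = None, 0
while not generator.is_done():
    generator.generate_next_token()
    tokens += 1
    if first_token_at is None:
        first_token_at = time.perf_counter()

elapsed = time.perf_counter() - start
print(f"TTFT: {first_token_at - start:.3f} s")
print(f"Decode rate: {tokens / elapsed:.1f} tok/s")
```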
RAI 1.7 focuses on the things that smooth out day-to-day development: more model choices (MoE + VLM), a single installer that includes Stable Diffusion, longer LLM context windows in hybrid mode, and noticeably lower BF16 latency. The result is less friction in setup, quicker feedback loops when you test changes, and a more capable local stack for shipping LLM/VLM features.
For a detailed overview of the new features and enhancements in the 1.7 software release, check out the official release notes.
Subscribe to be notified of future Ryzen AI software updates and get the latest tools and resources to help you explore the limits of what's possible on AI PCs.
Additional Resources
- Ryzen AI Video Tutorials
- AMD Ryzen AI Developer Hub
- Ryzen AI 1.7 Release Notes
- Ryzen AI 1.7 Models
- Hybrid (iGPU + NPU): Ryzen AI 1.7 Hybrid Models on Hugging Face
- NPU-Only: Ryzen AI 1.7 NPU-Only Models on Hugging Face
Visit the Ryzen™ AI documentation to learn more about supported architectures, setup instructions, and how to start building with RAI 1.7.