Agentic AI Isn’t One Workload. It’s an End-to-End Workflow.

Jun 29, 2026

Much of the AI infrastructure conversation starts with an AI model running on GPUs. But in practice, AI infrastructure demands are increasingly determined by the workflow around the model.

Agentic AI systems do not simply answer a prompt. They interpret intent, retrieve context, plan next steps, call tools, apply policy, run sandbox code, execute transactions, observe outcomes and return a result.

Each step is a different workload, and together they add up to a varied workflow. Some demand high core density. Some benefit from high frequency and predictable latency. Others depend on memory capacity, I/O, data locality, power efficiency or the ability to host many concurrent services.

As agentic AI becomes more pervasive, infrastructure teams need more than a single compute profile. CIOs and enterprise decision-makers need a portfolio of CPUs matched to the full agentic workflow.

The AMD EPYC™ server CPU portfolio is ideally positioned to play those roles – not as a single CPU with a one-size-fits-all answer, but as a range of parts, each suited to a different workload within the agentic workflow. (Learn more about CPU importance in agentic AI in my earlier blog, Agentic AI Changes the CPU/GPU Equation.)

Inside the Agentic AI Workflow

When an agent takes on a task, it breaks the goal into steps and works through them, often looping back multiple times before finishing. In a typical sequence, the request hits a gateway where policies are enforced. A planning layer – often running smaller AI models – determines how to route the task. The agent then queries databases, invokes a GPU cluster for deeper reasoning, executes tools based on that reasoning, verifies the output and decides whether to loop again or exit.
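The sequence above can be sketched as a simple control loop. This is a minimal illustration only, not an AMD implementation: every function name here (gateway_allows, plan, retrieve, reason, execute_tool, verify) is a hypothetical stand-in, and the tool is rigged to succeed on the second attempt purely to show the loop-back behavior.

```python
from dataclasses import dataclass, field


@dataclass
class AgentRun:
    goal: str
    trace: list[str] = field(default_factory=list)  # stages visited, in order


def gateway_allows(goal: str) -> bool:
    """Hypothetical policy check: reject empty requests."""
    return bool(goal.strip())


def plan(goal: str, attempt: int) -> str:
    """Stand-in for the planning layer (often a smaller model)."""
    return f"attempt {attempt}: {goal}"


def retrieve(step: str) -> str:
    """Stand-in for a database or context query."""
    return f"context for [{step}]"


def reason(step: str, context: str) -> str:
    """Stand-in for GPU-hosted inference."""
    return f"action from [{step}] using [{context}]"


def execute_tool(action: str, attempt: int) -> dict:
    """Hypothetical tool that only succeeds on the second attempt."""
    return {"action": action, "ok": attempt >= 2}


def verify(result: dict) -> bool:
    """Decide whether the output is good enough to exit the loop."""
    return result["ok"]


def run_agent(goal: str, max_loops: int = 5) -> AgentRun:
    run = AgentRun(goal)
    run.trace.append("gateway")
    if not gateway_allows(goal):
        run.trace.append("rejected")
        return run
    for attempt in range(1, max_loops + 1):
        run.trace.append("plan")
        step = plan(goal, attempt)
        run.trace.append("retrieve")
        context = retrieve(step)
        run.trace.append("reason")
        action = reason(step, context)
        run.trace.append("tool")
        result = execute_tool(action, attempt)
        run.trace.append("verify")
        if verify(result):
            run.trace.append("exit")
            return run
    run.trace.append("gave_up")  # loop budget exhausted without a verified result
    return run


if __name__ == "__main__":
    print(run_agent("reconcile invoices").trace)
```

In this sketch only the "reason" stage stands in for GPU work. The gateway, planning, retrieval, tool and verification stages are the kind of steps that would typically land on CPUs, and they repeat on every pass through the loop.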

This explains why agentic AI should be viewed as an end-to-end workflow, not as a single workload. The right infrastructure strategy starts by mapping each stage of the workflow and then assigning the right CPU resources to it.

AMD focuses on every step along the workflow: EPYC CPUs for high-frequency and high-density compute, AMD Instinct™ accelerators for AI inference and training, and Pensando™ networking to help move data predictably.

[Figure: CPU use across the agentic AI pipeline]

Where Latency Matters, Where Throughput Wins – and Where You Need Both

Each stage of the workflow has different needs, which is why we built the AMD EPYC portfolio around a mix of profiles.

Agentic orchestration, sandbox execution, tool calls: When you need many agents simultaneously running sandbox code (e.g., Python), calling APIs or querying databases, core density can matter more than clock speed. Our 5th Gen AMD EPYC™ server CPUs offer up to 192 cores and 384 threads with simultaneous multithreading. Later this year, our next-generation EPYC processors, codenamed “Venice,” will push that to 256 cores and 512 threads.
Tool execution on enterprise applications: The ability to call tools and enterprise applications is what makes agents useful. A CPU family that combines a broad range of core counts with high performance can handle the volume and variety of incoming requests. The AMD EPYC™ 9005 family of processors delivers this balance with 8 to 192 cores and up to 640 GB/s of memory bandwidth, with “Venice” extending core/thread count by 1.3x and memory bandwidth by 2.5x.
Reasoning with inference: To provide the intelligence agents need to get work done, they rely on inference. Large language models predominantly run on GPUs, with a host CPU keeping the GPUs fully utilized. To keep accelerators busy, host-node CPUs often benefit from strong per-core performance, high frequencies and the right balance of cores (sometimes fewer are needed than you might think), memory bandwidth, I/O and networking. The correct mix in the host-node CPU can keep GPU clusters fed with instructions so each cluster delivers as many tokens as possible. The AMD EPYC™ 9575F processor delivers this high single-core performance with 64 cores capable of running at up to 5 GHz. “Venice” will further extend EPYC CPUs’ high-frequency offerings.
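The three profiles above can be written down as a small lookup table, together with a quick check of the ratios quoted in the text. This is only an illustrative sketch: the StageProfile type, the stage keys and the profile_for helper are hypothetical names, and the implied "Venice" bandwidth is simple arithmetic on the figures quoted above, not a published specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageProfile:
    stage: str        # hypothetical short key for the workflow stage
    priority: str     # what the text says matters most at this stage
    named_part: str   # part or family the text names for it


PROFILES = (
    StageProfile("orchestration", "core density",
                 "5th Gen AMD EPYC, up to 192 cores / 384 threads"),
    StageProfile("tool execution", "range of core counts plus memory bandwidth",
                 "AMD EPYC 9005 family, 8 to 192 cores"),
    StageProfile("inference host", "per-core performance and frequency",
                 "AMD EPYC 9575F, 64 cores, up to 5 GHz"),
)


def profile_for(stage: str) -> StageProfile:
    """Return the profile whose key matches the given stage."""
    for profile in PROFILES:
        if profile.stage == stage:
            return profile
    raise KeyError(stage)


# Figures quoted in the text above.
CORES_5TH_GEN = 192
CORES_VENICE = 256
BANDWIDTH_9005_GB_PER_S = 640
VENICE_BANDWIDTH_FACTOR = 2.5

# 256 / 192 is about 1.33, consistent with the quoted "1.3x".
core_ratio = CORES_VENICE / CORES_5TH_GEN

# Derived from the quoted figures only; an assumption, not a datasheet value.
implied_bandwidth_gb_per_s = BANDWIDTH_9005_GB_PER_S * VENICE_BANDWIDTH_FACTOR

if __name__ == "__main__":
    print(profile_for("inference host"))
    print(round(core_ratio, 2), implied_bandwidth_gb_per_s)
```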

The Legacy Challenge

In conversations with enterprise customers, a couple of patterns stand out.

First, many standardize their CPU infrastructure purchases around legacy specifications, such as using 16- and 32-core CPUs. Agentic workflows need higher core counts for some agentic stages, higher frequencies for others – and customers need the flexibility to configure for both. The mindset should shift from a single CPU standard to a portfolio matched to the agentic workflow.

Second, there’s a multiplier effect on enterprise applications and inference servers that comes as agents become greater users of existing IT infrastructure. Once you give employees the ability to build and deploy their own agents, agentic adoption grows rapidly. IT planning teams should ask what happens to their infrastructure when agents dramatically increase usage. Examples include databases; platforms for enterprise resource planning, customer relationship management, business intelligence and identity management; and inference servers.
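One rough way to reason about this multiplier effect is a back-of-the-envelope estimate like the one below. It is a sketch under stated assumptions: the load_multiplier function and all input numbers are hypothetical, and the model simply adds a steady agent request rate on top of an existing human-driven rate, ignoring bursts and caching.

```python
def load_multiplier(
    human_rps: float,
    agents: int,
    tasks_per_agent_per_hour: float,
    calls_per_task: float,
) -> float:
    """Estimate how much total request load grows once agents are added.

    human_rps: existing requests per second from human users
    agents: number of deployed agents (hypothetical input)
    tasks_per_agent_per_hour: tasks each agent completes per hour
    calls_per_task: backend calls (database, ERP, CRM, ...) per task
    """
    if human_rps <= 0:
        raise ValueError("human_rps must be positive")
    # Convert hourly agent activity into requests per second.
    agent_rps = agents * tasks_per_agent_per_hour * calls_per_task / 3600
    return (human_rps + agent_rps) / human_rps


if __name__ == "__main__":
    # Hypothetical example: 200 agents, 30 tasks per hour, 12 calls per task.
    print(load_multiplier(50, 200, 30, 12))
```

With these made-up inputs, the agents add 20 requests per second to a baseline of 50. The point of the exercise is the shape of the calculation rather than the numbers: each new agent multiplies calls against the same back-end systems.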

The Question for CIOs

Agentic AI is changing how enterprises size their infrastructure. IT leaders who treat it as a monolithic problem – one GPU strategy or a one-size-fits-all CPU – will likely hit challenges. But as agents proliferate, those who plan for a diverse end-to-end workflow with different compute needs at each stage can scale more efficiently.

The question worth asking isn’t how many CPUs or GPUs your business needs for agentic AI. It’s whether you’re matching infrastructure to the way agentic AI actually works, across its many stages and workloads. If you map those stages early and choose the right compute profile for each, your business will be well positioned for speed and efficiency as your agents scale.

Article By

Dan McNamara

Senior Vice President and General Manager, Compute & Enterprise AI

Dan McNamara is senior vice president and general manager of Compute and Enterprise AI at AMD. He leads the company’s high-performance server and enterprise AI business across cloud, enterprise, high-performance computing, sovereign AI and partner ecosystems. McNamara holds bachelor’s and master’s degrees in electrical engineering from Worcester Polytechnic Institute.
