Overview

Deploy small and mid-size models on AMD EPYC™ 9005 server CPUs—on prem or in the cloud—and help maximize value from your computing investments.

Which Hardware Is Best for Different Inference Workloads?

To avoid overprovisioning and get the best return on your AI investments, it’s important to match your model size and latency requirements to the right hardware. The latest generations of AMD EPYC server CPUs can handle a range of AI tasks alongside general-purpose workloads. As model sizes grow, volumes increase, and lower latencies become critical, GPUs become more efficient and cost-effective.

Start with CPUs for Cost-Effective Inference

The latest AMD EPYC server CPUs can run small to medium AI inference workloads with sub-second latency, making them a good fit for small and mid-size models. Use CPUs for batch or offline processing where latency is not critical, and for mid-latency (seconds to minutes) and low-latency (500 ms to seconds) response times.

Use GPU Clusters for Large-Scale Deployments

For large models, real-time workloads, and complex, multi-agent pipelines, GPU clusters can deliver high performance per dollar. AMD Instinct platforms use multiple GPUs and are optimal for models with over approximately 450 billion parameters. These GPU clusters can deliver near-real-time and real-time responses.
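The sizing guidance above can be sketched as a simple decision helper. The thresholds come from this page (CPUs alone up to roughly 20 billion parameters, GPU clusters above roughly 450 billion); the function name, latency labels, and the middle CPU + PCIe GPU fallback are illustrative assumptions, not an official AMD sizing tool.

```python
def suggest_inference_hardware(model_params_b: float, latency: str) -> str:
    """Suggest a hardware tier from model size (in billions of parameters)
    and a latency requirement: "batch", "mid", "low", or "real-time".

    Thresholds follow the guidance on this page (~20B parameters for
    CPU-only inference, ~450B for GPU clusters); everything else here
    is an illustrative assumption.
    """
    # Very large models and real-time response targets call for GPU clusters.
    if model_params_b > 450 or latency == "real-time":
        return "GPU cluster"
    # Small and mid-size models with batch-to-low-latency targets fit CPUs alone.
    if model_params_b <= 20 and latency in ("batch", "mid", "low"):
        return "CPU-only server"
    # In between, pair CPUs with one or more PCIe-based GPUs.
    return "CPU + PCIe-based GPU"
```

For example, an 8B-parameter translation model with a seconds-scale latency target lands on a CPU-only server, while a 671B-parameter chatbot lands on a GPU cluster.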

AI Inference Workloads: Good Fit for CPUs, CPUs + PCIe-Based GPUs, or GPU Clusters

Document processing and classification, data mining and analytics, scientific simulations, translation, indexing, content moderation, predictive maintenance, virtual assistants, chatbots, expert agents, video captioning, fraud detection, decision-making, dynamic pricing, audio and video filtering, financial trading, telecommunications and networking, and autonomous systems.

The AI continuum: what infrastructure works best for inference? (infographic)

Find the Best Inference Hardware

Depending on your workload requirements, either high-core-count CPUs alone or a combination of CPUs and GPUs works best for inference. Learn more about which infrastructure fits your model size and latency needs.

5 AI Inference Workloads that Run on a CPU 

The latest AMD EPYC server CPUs can meet the performance requirements of a range of AI workloads, including classic machine learning, computer vision, and AI agents. Read about five popular workloads that run great on CPUs.


Fast, Efficient Inference with AMD EPYC Server CPUs

Whether deployed in a CPU-only server or used as a host for GPUs executing larger models, AMD EPYC server CPUs are designed with the latest open standard technologies to accelerate enterprise AI inference workloads.

5th Gen AMD EPYC Server CPUs Outperform Intel Xeon 6 in Inference, End-to-End AI, and Machine Learning

Claims compare 5th Gen AMD EPYC 9965 server CPUs versus Intel Xeon 6980P.

Up to 89% better chatbot performance on DeepSeek³
Up to 33% better inference performance for a translation use case with Llama 3.1 8B⁴
Up to 36% better inference performance for a translation use case with Llama 3.2 1B⁵
Relative performance, 5th Gen AMD EPYC 9965 vs. Intel Xeon 6980P:

Translation on Llama 3.2 1B⁵: ~1.36x
Essay on Llama 3.2 1B⁵: ~1.27x
Translation on Llama 3.1 8B⁴: ~1.33x
Summarization on GPT-J 6B⁶: ~1.28x
Chatbot on DeepSeek-R1 671B³: ~1.89x
Essay on DeepSeek-R1 671B³: ~1.71x
Summary on DeepSeek-R1 671B³: ~1.41x
Rewrite on DeepSeek-R1 671B³: ~1.20x
TPCx-AI@SF30 derivative¹⁰: ~1.70x
XGBoost (Higgs)¹¹: ~1.93x
Facebook AI Similarity Search (FAISS)¹²: ~1.60x


Frequently Asked Questions

How do I choose the right hardware for AI inference?

First, determine your performance needs. How fast do you need responses: minutes, seconds, or milliseconds? How big are the models you’re running, in parameters? You may be able to meet performance requirements simply by upgrading to a 5th Gen AMD EPYC CPU, avoiding the cost of GPU hardware.

When is batch inference enough, and when do I need real-time inference?

If you don’t need responses in real time, batch inference is cost-efficient for large-scale and long-term analysis, for example analyzing campaign performance or predictive maintenance. Real-time inference that supports interactive use cases like financial trading and autonomous systems may need GPU accelerators. While CPUs alone are excellent for batch inference, GPUs are best for real-time inference.

How large a model can CPUs handle on their own?

CPUs alone offer enough performance for inference on models up to ~20 billion parameters and for mid-latency response times (seconds to minutes). This is sufficient for many AI assistants, chatbots, and agents. Consider adding GPU accelerators when models are larger or response times must be faster.
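As a back-of-the-envelope check on these latency tiers, you can estimate the decode throughput a deployment must sustain from its response-time budget. The function below and its 0.3 s time-to-first-token default are illustrative assumptions, not measured AMD figures.

```python
def required_tokens_per_second(output_tokens: int, response_budget_s: float,
                               time_to_first_token_s: float = 0.3) -> float:
    """Estimate the decode throughput (tokens/s) needed to finish a reply
    within a response-time budget, after subtracting the time to first
    token. All numbers here are illustrative assumptions.
    """
    generation_window = response_budget_s - time_to_first_token_s
    if generation_window <= 0:
        raise ValueError("response budget must exceed time to first token")
    return output_tokens / generation_window

# e.g. a 128-token chatbot reply within a 2 s budget, assuming 0.3 s
# time to first token, needs roughly 75 tokens/s of decode throughput.
```

Estimates like this make it easier to tell whether a workload fits the mid-latency tier that CPUs handle well or needs accelerator-class throughput.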

Do AMD EPYC server CPUs outperform Intel Xeon 6 for AI inference?

The short answer: it depends. Extracting maximum performance is highly workload- and expertise-dependent. That said, select 5th Gen AMD EPYC server CPUs outperform comparable Intel Xeon 6 processors in inference for many popular AI workloads, including large language models (DeepSeek-R1 671B),³ medium language models (Llama 3.1 8B⁴ and GPT-J 6B⁶), and small language models (Llama 3.2 1B).⁵

How do AMD EPYC server CPUs help secure AI workloads?

AMD EPYC server CPUs include AMD Infinity Guard, which provides a silicon-based set of security features.⁷ AMD Infinity Guard includes AMD Secure Encrypted Virtualization (AMD SEV), a widely adopted confidential computing solution that uses confidential virtual machines (VMs) to help protect data, AI models, and workloads at runtime.

AMD Powers the Full Spectrum of AI

Match your infrastructure needs to your AI ambitions. AMD offers the broadest AI portfolio, open standards-based platforms, and a powerful ecosystem—all backed by performance leadership.

AMD Instinct™ GPUs

Available in a PCIe form factor or integrated cluster, AMD Instinct™ GPUs bring exceptional efficiency and performance to generative AI, ideal for training massive models and high-speed inference.

AMD Versal™ Adaptive SoCs

This highly integrated compute platform for embedded applications includes real-time CPU cores, programmable logic, and a network on chip (NoC), plus AI engines for machine learning, providing outstanding system-level performance in use cases that demand customized hardware.

Data Security for AI Workloads

As AI fuels data growth, advanced security becomes even more critical. This need is further amplified by increasing emphasis on privacy regulations, data sovereignty, and severe penalties for breaches. Built-in at the silicon level, AMD Infinity Guard offers the security capabilities required for AI, including AMD Secure Encrypted Virtualization (SEV), the industry’s most mature confidential computing solution.⁷

AMD EPYC Deployment Options


Broad Ecosystem for AI On-Premises 

Find enterprise AI hardware from our OEM partners, including servers with high core count and high frequency CPUs, a premier line of GPUs, and interoperable networking solutions.


Scale AI in the Cloud

Get the most from your cloud by choosing AMD technology-based virtual machines (VMs) for AI workloads.

Inference Frameworks for Open Software Development

With AMD ZenDNN and AMD ROCm™ software, developers can optimize their application performance while using their choice of frameworks.

Resources

Technical Articles and Blogs

Get technical details and guidance on using AMD EPYC server CPU features, tools, and tuning for your inference workloads.

AMD TechTalk Podcasts

Hear about the latest trends in AI from leading technology experts.

Subscribe to Data Center Insights from AMD

Request Contact from an AMD EPYC Sales Expert

Footnotes
  1. 9xx5-169: Llama-3.3-70B latency constrained throughput (goodput ) results based on AMD internal testing as of 05/14/2025.Configurations: Llama-3.3-70B, vLLM API server v1.0, data set: Sonnet3.5-SlimOrcaDedupCleaned, TP8, 512 max requests (dynamic batching), latency constrained time to first token (300ms, 400ms, 500ms, 600ms), OpenMP 128, results in tokens/s. 2P AMD EPYC 9575F (128 Total Cores, 400W TDP, production system, 1.5TB 24x64GB DDR5-6400 running at 6000 MT/s, 2 x 25 GbE ConnectX-6 Lx MT2894, 4x 3.84TB Samsung MZWLO3T8HCLS-00A07 NVMe ; Micron_7450_MTFDKCC800TFS 800GB NVMe for OS, Ubuntu 22.04.3 LTS, kernel=5.15.0-117-generic , BIOS 3.2, SMT=OFF, Determinism=power, mitigations=off)  with 8x NVIDIA H100. 2P Intel Xeon 8592+ (128 Total Cores, 350W TDP, production system, 1TB 16x64GB DDR5-5600 , 2 x 25 GbE ConnectX-6 Lx (MT2894), 4x 3.84TB Samsung MZWLO3T8HCLS-00A07 NVMe, Micron_7450_MTFDKBA480TFR 480GB NVMe , Ubuntu 22.04.3 LTS, kernel-5.15.0-118-generic , SMT=OFF, Performance Bias, Mitigations=off) with 8x NVIDIA H100. Results:CPU 300 400 500 600; 8592+ 0 126.43 1565.65 1987.19; 9575F 346.11 2326.21; 2531.38 2572.42; Relative NA 18.40 1.62 1.29. Results may vary due to factors including system configurations, software versions, and BIOS settings. TDP information from ark.intel.com
  2. Parallel draft models (PARD) technology on Llama-3.2-1B-Instruct. See configurations: https://www.amd.com/en/developer/resources/technical-articles/2025/speculative-llm-inference-on-the-5th-gen-amd-epyc-processors-wit.html
  3. 9xx5-152A: Deepseek-R1-671B throughput results based on AMD internal testing as of 01/28/2025. Configurations: llama.cpp framework, 1.58 bit quantization (UD_IQ1_S, MoE at 1.56 bit), batch sizes 1 and 4, 16C Instances, Use Case Input/Output token configurations: [Chatbot = 128/128, Essay = 128/1024, Summary = 1024/128, Rewrite = 1024/1024]. 2P AMD EPYC 9965 (384 Total Cores, 500W TDP, reference system, 3TB 24x128GB DDR5-6400, 2 x 40 GbE Mellanox CX-7 (MT2910) 3.84TB Samsung MZWLO3T8HCLS-00A07 NVMe, Ubuntu® 22.04.3 LTS | 5.15.0-105-generic), SMT=ON, Determinism=power, Mitigations=on) 2P AMD EPYC 9755 (256 Total Cores, 500W TDP, reference system, 3TB 24x128GB DDR5-6400, 2 x 40 GbE Mellanox CX-7 (MT2910) 3.84TB Samsung MZWLO3T8HCLS-00A07 NVMe, Ubuntu® 22.04.3 LTS | 5.15.0-105-generic), SMT=ON, Determinism=power, Mitigations=on) 2P Intel Xeon 6980P (256 Total Cores, 500W TDP, production system, 3TB 24x64GB DDR5-6400, 4 x 1GbE Broadcom NetXtreme BCM5719 Gigabit Ethernet PCIe 3.84TB SAMSUNG MZWLO3T8HCLS-00A07 NVMe, Ubuntu 24.04.2 LTS | 6.13.2-061302-generic, SMT=ON, Performance Bias, Mitigations=on) Results: BS=1 6980P 9755 9965 Rel9755 Rel9965 Chatbot 47.31 61.88 70.344 1.308 1.487 Essay 42.97 56.04 61.608 1.304 1.434 Summary 44.99 59.39 62.304 1.32 1.385 Rewrite 41.8 68.44 55.08 1.637 1.318 BS=4 6980P 9755 Rel9755 Rel9965 Chatbot 76.01 104.46 143.496 1.374 1.888 Essay 67.89 93.68 116.064 1.38 1.71 Summary 70.88 103.39 99.96 1.459 1.41 Rewrite 65 87.9 78.12 1.352 1.202 Results may vary due to factors including system configurations, software versions, and BIOS settings.
  4. 9xx5-156: Llama3.1-8B throughput results based on AMD internal testing as of 04/08/2025. Llama3.1-8B configurations: BF16, batch size 32, 32C Instances, Use Case Input/Output token configurations: [Summary = 1024/128, Chatbot = 128/128, Translate = 1024/1024, Essay = 128/1024]. 2P AMD EPYC 9965 (384 Total Cores), 1.5TB 24x64GB DDR5-6400, 1.0 Gbps NIC, 3.84 TB Samsung MZWLO3T8HCLS-00A07, Ubuntu® 22.04.5 LTS, Linux 6.9.0-060900-generic, BIOS RVOT1004A, (SMT=off, mitigations=on, Determinism=Power), NPS=1, ZenDNN 5.0.1 2P AMD EPYC 9755 (256 Total Cores), 1.5TB 24x64GB DDR5-6400, 1.0 Gbps NIC, 3.84 TB Samsung MZWLO3T8HCLS-00A07, Ubuntu® 22.04.4 LTS, Linux 6.8.0-52-generic, BIOS RVOT1004A, (SMT=off, mitigations=on, Determinism=Power), NPS=1, ZenDNN 5.0.1 2P Xeon 6980P (256 Total Cores), AMX On, 1.5TB 24x64GB DDR5-8800 MRDIMM, 1.0 Gbps Ethernet Controller X710 for 10GBASE-T, Micron_7450_MTFDKBG1T9TFR 2TB, Ubuntu 22.04.1 LTS Linux 6.8.0-52-generic, BIOS 1.0 (SMT=off, mitigations=on Performance Bias), IPEX 2.6.0 Results: CPU 6980P 9755 9965 Summary 1 n/a1.093 Translate 1 1.062 1.334 Essay 1 n/a 1.14 Results may vary due to factors including system configurations, software versions, and BIOS settings.
  5. 9xx5-166: Llama3.2-1B throughput results based on AMD internal testing as of 04/08/2025. Llama3.3-1B configurations: BF16, batch size 32, 32C Instances, Use Case Input/Output token configurations: [Summary = 1024/128, Chatbot = 128/128, Translate = 1024/1024, Essay = 128/1024]. 2P AMD EPYC 9965 (384 Total Cores), 1.5TB 24x64GB DDR5-6400, 1.0 Gbps NIC, 3.84 TB Samsung MZWLO3T8HCLS-00A07, Ubuntu® 22.04.5 LTS, Linux 6.9.0-060900-generic, BIOS RVOT1004A, (SMT=off, mitigations=on, Determinism=Power), NPS=1, ZenDNN 5.0.1, Python 3.10.2 2P Xeon 6980P (256 Total Cores), AMX On, 1.5TB 24x64GB DDR5-8800 MRDIMM, 1.0 Gbps Ethernet Controller X710 for 10GBASE-T, Micron_7450_MTFDKBG1T9TFR 2TB, Ubuntu 22.04.1 LTS Linux 6.8.0-52-generic, BIOS 1.0 (SMT=off, mitigations=on, Performance Bias), IPEX 2.6.0, Python 3.12.3 Results: CPU 6980P 9965 Summary 1 1.213 Translation 1 1.364 Essay 1 1.271 Results may vary due to factors including system configurations, software versions, and BIOS settings.
  6. 9xx5-158: GPT-J-6B throughput results based on AMD internal testing as of 04/08/2025. GPT-J-6B configurations: BF16, batch size 32, 32C Instances, Use Case Input/Output token configurations: [Summary = 1024/128, Chatbot = 128/128, Translate = 1024/1024, Essay = 128/1024]. 2P AMD EPYC 9965 (384 Total Cores), 1.5TB 24x64GB DDR5-6400, 1.0 Gbps NIC, 3.84 TB Samsung MZWLO3T8HCLS-00A07, Ubuntu® 22.04.5 LTS, Linux 6.9.0-060900-generic, BIOS RVOT1004A, (SMT=off, mitigations=on, Determinism=Power), NPS=1, ZenDNN 5.0.1, Python 3.10.12 2P AMD EPYC 9755 (256 Total Cores), 1.5TB 24x64GB DDR5-6400, 1.0 Gbps NIC, 3.84 TB Samsung MZWLO3T8HCLS-00A07, Ubuntu® 22.04.4 LTS, Linux 6.8.0-52-generic, BIOS RVOT1004A, (SMT=off, mitigations=on, Determinism=Power), NPS=1, ZenDNN 5.0.1, Python 3.10.12 2P Xeon 6980P (256 Total Cores), AMX On, 1.5TB 24x64GB DDR5-8800 MRDIMM, 1.0 Gbps Ethernet Controller X710 for 10GBASE-T, Micron_7450_MTFDKBG1T9TFR 2TB, Ubuntu 22.04.1 LTS Linux 6.8.0-52-generic, BIOS 1.0 (SMT=off, mitigations=on, Performance Bias), IPEX 2.6.0, Python 3.12.3 Results: CPU 6980P 9755 9965 Summary 1 1.034 1.279 Chatbot 1 0.975 1.163 Translate 1 1.021 0.93 Essay 1 0.978 1.108 Caption 1 0.913 1.12 Overall 1 0.983 1.114 Results may vary due to factors including system configurations, software versions, and BIOS settings.
  7. GD-183A AMD Infinity Guard features vary by EPYC™ Processor generations and/or series. Infinity Guard security features must be enabled by server OEMs and/or Cloud Service Providers to operate. Check with your OEM or provider to confirm support of these features. Learn more about Infinity Guard at https://www.amd.com/en/products/processors/server/epyc/infinity-guard.html
  8. 9xx5-002F: SPECrate®2017_int_base comparison based on published scores from www.spec.org as of 12/4/2025. Results and configurations below are in the format of: [processor], [cores], [TDP], [1Ku price in USD], [SPECrate®2017_int_base score], [SPECrate®2017_int_base score / CPU W], [SPECrate®2017_int_base score / 1Ku price in USD], [Link to score]
    2P AMD EPYC 9654, 96C, 360W, $8452 USD, 1830, 5.083, 0.217, https://www.spec.org/cpu2017/results/res2025q3/cpu2017-20250727-49206.html
    2P AMD EPYC 9754, 128C, 360W, $10631 USD, 1950, 5.417, 0.183, https://www.spec.org/cpu2017/results/res2023q2/cpu2017-20230522-36617.html
    2P AMD EPYC 9755, 128C, 500W, $10931 USD, 2840, 5.680, 0.260, https://www.spec.org/cpu2017/results/res2025q2/cpu2017-20250324-47223.html
    2P AMD EPYC 9965, 192C, 500W, $11988 USD, 3230, 6.460, 0.269, https://www.spec.org/cpu2017/results/res2025q2/cpu2017-20250324-47086.html
    2P Intel Xeon 6780E, 144C, 330W, $8513 USD, 1410, 4.273, 0.166, https://www.spec.org/cpu2017/results/res2024q3/cpu2017-20240811-44406.html
    2P Intel Xeon 6980P, 128C, 500W, $12460 USD, 2510, 5.020, 0.201, https://www.spec.org/cpu2017/results/res2025q2/cpu2017-20250324-47099.html
    2P Intel Xeon Platinum 8592+, 64C, 350W, $11600 USD, 1130, 3.229, 0.097, https://www.spec.org/cpu2017/results/res2023q4/cpu2017-20231127-40064.html
    SPEC®, SPEC CPU®, and SPECrate® are registered trademarks of the Standard Performance Evaluation Corporation. See www.spec.org for more information. AMD CPU prices as of 12/9/2025. Intel CPU W and prices at https://ark.intel.com/ as of 12/9/2025
  9. 9xx5-001: Based on AMD internal testing as of 9/10/2024, geomean performance improvement (IPC) at fixed-frequency. - 5th Gen EPYC generational ML/HPC Server Workloads IPC Uplift of 1.369x (geomean) using a select set of 24 workloads and is the geomean of representative ML Server Workloads (geomean), and representative HPC Server Workloads (geomean). “Genoa Config (all NPS1) “Genoa” config: EPYC 9654 BIOS TQZ1005D 12c12t (1c1t/CCD in 12+1), FF 3GHz, 12x DDR5-4800 (2Rx4 64GB), 32Gbps xGMI; “Turin” config (all NPS1):   EPYC 9V45 BIOS RVOT1000F 12c12t (1c1t/CCD in 12+1), FF 3GHz, 12x DDR5-6000 (2Rx4 64GB), 32Gbps xGMI Utilizing Performance Determinism and the Performance governor on Ubuntu 22.04 w/ 6.8.0-40-generic kernel OS for all workloads except LAMMPS, HPCG, NAMD, OpenFOAM, Gromacs  which utilize 24.04 w/ 6.8.0-40-generic kernel. SPEC® and SPECrate® are registered trademarks for Standard Performance Evaluation Corporation. Learn more at spec.org.
  10. 9xx5-151: TPCxAI @SF30 Multi-Instance, 32C Instance Size throughput results based on AMD internal testing as of 04/01/2025 running multiple VM instances. The aggregate end-to-end AI throughput test is derived from the TPCx-AI benchmark and as such is not comparable to published TPCx-AI results, as the end-to-end AI throughput test results do not comply with the TPCx-AI Specification. 2P   AMD EPYC 9965 (6067.53 Total AIUCpm, 384 Total Cores, 500W TDP, AMD reference system, 1.5TB 24x64GB DDR5-6400, 2 x 40 GbE Mellanox CX-7 (MT2910), 3.84TB Samsung MZWLO3T8HCLS-00A07 NVMe, Ubuntu® 24.04 LTS kernel 6.13, SMT=ON, Determinism=power, Mitigations=on) 2P AMD EPYC 9755 (4073.42 Total AIUCpm, 256 Total Cores, 500W TDP, AMD reference system, 1.5TB 24x64GB DDR5-6400, 2 x 40 GbE Mellanox CX-7 (MT2910) 3.84TB Samsung MZWLO3T8HCLS-00A07 NVMe, Ubuntu 24.04 LTS kernel 6.13, SMT=ON, Determinism=power, Mitigations=on) 2P Intel Xeon 6980P (3550.50 Total AIUCpm, 256 Total Cores, 500W TDP, Production system, 1.5TB 24x64GB DDR5-6400, 4 x 1GbE Broadcom NetXtreme BCM5719 Gigabit Ethernet PCIe 3.84TB SAMSUNG MZWLO3T8HCLS-00A07 NVMe, Ubuntu 24.04 LTS kernel 6.13, SMT=ON, Performance Bias, Mitigations=on) Results may vary based on factors including but not limited to system configurations, software versions, and BIOS settings. TPC, TPC Benchmark, and TPC-H are trademarks of the Transaction Processing Performance Council.
  11. 9xx5-162: XGBoost (Runs/Hour) throughput results based on AMD internal testing as of 04/08/2025. XGBoost Configurations: v1.7.2, Higgs Data Set, 32 Core Instances, FP32 2P AMD EPYC 9965 (384 Total Cores), 1.5TB 24x64GB DDR5-6400 (at 6000 MT/s), 1.0 Gbps NIC, 3.84 TB Samsung MZWLO3T8HCLS-00A07, Ubuntu® 22.04.5 LTS, Linux 5.15 kernel, BIOS RVOT1004A, (SMT=off, mitigations=on, Determinism=Power), NPS=1 2P AMD EPYC 9755 (256 Total Cores), 1.5TB 24x64GB DDR5-6400 (at 6000 MT/s), 1.0 Gbps NIC, 3.84 TB Samsung MZWLO3T8HCLS-00A07, Ubuntu® 22.04.4 LTS, Linux 5.15 kernel, BIOS RVOT1004A, (SMT=off, mitigations=on, Determinism=Power), NPS=1 2P Xeon 6980P (256 Total Cores), 1.5TB 24x64GB DDR5-8800 MRDIMM, 1.0 Gbps Ethernet Controller X710 for 10GBASE-T, Micron_7450_MTFDKBG1T9TFR 2TB, Ubuntu 22.04.1 LTS Linux 6.8.0-52-generic, BIOS 1.0 (SMT=off, mitigations=on, Performance Bias) Results: CPU Throughput Relative 2P 6980P 400 1 2P 9755 436 1.090 2P 9965 771 1.928 Results may vary due to factors including system configurations, software versions and BIOS settings.
  12. 9xx5-164: FAISS (Runs/Hour) throughput results based on AMD internal testing as of 04/08/2025. FAISS Configurations: v1.8.0, sift1m Data Set, 32 Core Instances, FP32 2P AMD EPYC 9965 (384 Total Cores), 1.5TB 24x64GB DDR5-6400 (at 6000 MT/s), 1.0 Gbps NIC, 3.84 TB Samsung MZWLO3T8HCLS-00A07, Ubuntu® 22.04.5 LTS, Linux 5.15 kernel, BIOS RVOT1004A, (SMT=off, mitigations=on, Determinism=Power), NPS=1 2P AMD EPYC 9755 (256 Total Cores), 1.5TB 24x64GB DDR5-6400 (at 6000 MT/s), 1.0 Gbps NIC, 3.84 TB Samsung MZWLO3T8HCLS-00A07, Ubuntu® 22.04.4 LTS, Linux 5.15 kernel, BIOS RVOT1004A, (SMT=off, mitigations=on, Determinism=Power), NPS=1 2P Xeon 6980P (256 Total Cores), 1.5TB 24x64GB DDR5-8800 MRDIMM, 1.0 Gbps Ethernet Controller X710 for 10GBASE-T, Micron_7450_MTFDKBG1T9TFR 2TB, Ubuntu 22.04.1 LTS Linux 6.8.0-52-generic, BIOS 1.0 (SMT=off, mitigations=on, Performance Bias) Results: Throughput Relative 2P 6980P 36.63 1 2P 9755 46.86 1.279 2P 9965 58.6 1.600 Results may vary due to factors including system configurations, software versions and BIOS settings.
  13. 9xx5-012: TPCxAI @SF30 Multi-Instance 32C Instance Size throughput results based on AMD internal testing as of 09/05/2024 running multiple VM instances. The aggregate end-to-end AI throughput test is derived from the TPCx-AI benchmark and as such is not comparable to published TPCx-AI results, as the end-to-end AI throughput test results do not comply with the TPCx-AI Specification.
    2P AMD EPYC 9965 (384 Total Cores), 12 32C instances, NPS1, 1.5TB 24x64GB DDR5-6400 (at 6000 MT/s), 1DPC, 1.0 Gbps NetXtreme BCM5720 Gigabit Ethernet PCIe, 3.5 TB Samsung MZWLO3T8HCLS-00A07 NVMe®, Ubuntu® 22.04.4 LTS, 6.8.0-40-generic (tuned-adm profile throughput-performance, ulimit -l 198096812, ulimit -n 1024, ulimit -s 8192), BIOS RVOT1000C (SMT=off, Determinism=Power, Turbo Boost=Enabled)
    2P AMD EPYC 9755 (256 Total Cores), 8 32C instances, NPS1, 1.5TB 24x64GB DDR5-6400 (at 6000 MT/s), 1DPC, 1.0 Gbps NetXtreme BCM5720 Gigabit Ethernet PCIe, 3.5 TB Samsung MZWLO3T8HCLS-00A07 NVMe®, Ubuntu 22.04.4 LTS, 6.8.0-40-generic (tuned-adm profile throughput-performance, ulimit -l 198096812, ulimit -n 1024, ulimit -s 8192), BIOS RVOT0090F (SMT=off, Determinism=Power, Turbo Boost=Enabled)
    2P AMD EPYC 9654 (192 Total cores) 6 32C instances, NPS1, 1.5TB 24x64GB DDR5-4800, 1DPC, 2 x 1.92 TB Samsung MZQL21T9HCJR-00A07 NVMe, Ubuntu 22.04.3 LTS, BIOS 1006C (SMT=off, Determinism=Power)
    Versus 2P Xeon Platinum 8592+ (128 Total Cores), 4 32C instances, AMX On, 1TB 16x64GB DDR5-5600, 1DPC, 1.0 Gbps NetXtreme BCM5719 Gigabit Ethernet PCIe, 3.84 TB KIOXIA KCMYXRUG3T84 NVMe, , Ubuntu 22.04.4 LTS, 6.5.0-35 generic (tuned-adm profile throughput-performance, ulimit -l 132065548, ulimit -n 1024, ulimit -s 8192), BIOS ESE122V (SMT=off, Determinism=Power, Turbo Boost = Enabled)
    Results:
    CPU Median Relative Generational
    Turin 192C, 12 Inst 6067.531 3.775 2.278
    Turin 128C, 8 Inst 4091.85 2.546 1.536
    Genoa 96C, 6 Inst 2663.14 1.657 1
    EMR 64C, 4 Inst 1607.417 1 NA
    Results may vary due to factors including system configurations, software versions and BIOS settings. TPC, TPC Benchmark and TPC-C are trademarks of the Transaction Processing Performance Council.