5th Generation AMD EPYC™ Processors
Advancing AI-enabled, business-critical data center workloads
Footnotes
- 9xx5-012: TPCxAI @SF30 Multi-Instance 32C Instance Size throughput results based on AMD internal testing as of 09/05/2024 running multiple VM instances. The aggregate end-to-end AI throughput test is derived from the TPCx-AI benchmark and as such is not comparable to published TPCx-AI results, as the end-to-end AI throughput test results do not comply with the TPCx-AI Specification. 2P AMD EPYC 9965 (384 Total Cores), 12 32C instances, NPS1, 1.5TB 24x64GB DDR5-6400 (at 6000 MT/s), 1DPC, 1.0 Gbps NetXtreme BCM5720 Gigabit Ethernet PCIe, 3.5 TB Samsung MZWLO3T8HCLS-00A07 NVMe®, Ubuntu® 22.04.4 LTS, 6.8.0-40-generic (tuned-adm profile throughput-performance, ulimit -l 198096812, ulimit -n 1024, ulimit -s 8192), BIOS RVOT1000C (SMT=off, Determinism=Power, Turbo Boost=Enabled); 2P AMD EPYC 9755 (256 Total Cores), 8 32C instances, NPS1, 1.5TB 24x64GB DDR5-6400 (at 6000 MT/s), 1DPC, 1.0 Gbps NetXtreme BCM5720 Gigabit Ethernet PCIe, 3.5 TB Samsung MZWLO3T8HCLS-00A07 NVMe®, Ubuntu 22.04.4 LTS, 6.8.0-40-generic (tuned-adm profile throughput-performance, ulimit -l 198096812, ulimit -n 1024, ulimit -s 8192), BIOS RVOT0090F (SMT=off, Determinism=Power, Turbo Boost=Enabled); 2P AMD EPYC 9654 (192 Total cores), 6 32C instances, NPS1, 1.5TB 24x64GB DDR5-4800, 1DPC, 2 x 1.92 TB Samsung MZQL21T9HCJR-00A07 NVMe, Ubuntu 22.04.3 LTS, BIOS 1006C (SMT=off, Determinism=Power); Versus 2P Xeon Platinum 8592+ (128 Total Cores), 4 32C instances, AMX On, 1TB 16x64GB DDR5-5600, 1DPC, 1.0 Gbps NetXtreme BCM5719 Gigabit Ethernet PCIe, 3.84 TB KIOXIA KCMYXRUG3T84 NVMe, Ubuntu 22.04.4 LTS, 6.5.0-35-generic (tuned-adm profile throughput-performance, ulimit -l 132065548, ulimit -n 1024, ulimit -s 8192), BIOS ESE122V (SMT=off, Determinism=Power, Turbo Boost = Enabled). Results may vary due to factors including system configurations, software versions and BIOS settings. TPC, TPC Benchmark and TPC-C are trademarks of the Transaction Processing Performance Council.
- 9xx5-009: Llama3.1-8B throughput results based on AMD internal testing as of 09/05/2024. Llama3-8B configurations: IPEX.LLM 2.4.0, NPS=2, BF16, batch size 4, Use Case Input/Output token configurations: [Summary = 1024/128, Chatbot = 128/128, Translate = 1024/1024, Essay = 128/1024, Caption = 16/16].
2P AMD EPYC 9965 (384 Total Cores), 6 64C instances, 1.5TB 24x64GB DDR5-6400 (at 6000 MT/s), 1DPC, 1.0 Gbps NetXtreme BCM5720 Gigabit Ethernet PCIe, 3.5 TB Samsung MZWLO3T8HCLS-00A07 NVMe®, Ubuntu® 22.04.3 LTS, 6.8.0-40-generic (tuned-adm profile throughput-performance, ulimit -l 198096812, ulimit -n 1024, ulimit -s 8192), BIOS RVOT1000C (SMT=off, Determinism=Power, Turbo Boost=Enabled), NPS=2;
2P AMD EPYC 9755 (256 Total Cores), 4 64C instances, 1.5TB 24x64GB DDR5-6400 (at 6000 MT/s), 1DPC, 1.0 Gbps NetXtreme BCM5720 Gigabit Ethernet PCIe, 3.5 TB Samsung MZWLO3T8HCLS-00A07 NVMe®, Ubuntu 22.04.3 LTS, 6.8.0-40-generic (tuned-adm profile throughput-performance, ulimit -l 198096812, ulimit -n 1024, ulimit -s 8192), BIOS RVOT1000C (SMT=off, Determinism=Power, Turbo Boost=Enabled), NPS=2;
2P AMD EPYC 9654 (192 Total Cores), 4 48C instances, 1.5TB 24x64GB DDR5-4800, 1DPC, 1.0 Gbps NetXtreme BCM5720 Gigabit Ethernet PCIe, 3.5 TB Samsung MZWLO3T8HCLS-00A07 NVMe®, Ubuntu® 22.04.4 LTS, 5.15.85-051585-generic (tuned-adm profile throughput-performance, ulimit -l 1198117616, ulimit -n 500000, ulimit -s 8192), BIOS RVI1008C (SMT=off, Determinism=Power, Turbo Boost=Enabled), NPS=2;
Versus 2P Xeon Platinum 8592+ (128 Total Cores), 2 64C instances, AMX On, 1TB 16x64GB DDR5-5600, 1DPC, 1.0 Gbps NetXtreme BCM5719 Gigabit Ethernet PCIe, 3.84 TB KIOXIA KCMYXRUG3T84 NVMe®, Ubuntu 22.04.4 LTS, 6.5.0-35-generic (tuned-adm profile throughput-performance, ulimit -l 132065548, ulimit -n 1024, ulimit -s 8192), BIOS ESE122V (SMT=off, Determinism=Power, Turbo Boost = Enabled).
Results may vary due to factors including system configurations, software versions and BIOS settings.
- 9xx5-040A: XGBoost (Runs/Hour) throughput results based on AMD internal testing as of 09/05/2024. XGBoost Configurations: v2.2.1, Higgs Data Set, 32 Core Instances, FP32.
2P AMD EPYC 9965 (384 Total Cores), 12 x 32 core instances, 1.5TB 24x64GB DDR5-6400 (at 6000 MT/s), 1.0 Gbps NetXtreme BCM5720 Gigabit Ethernet PCIe, 3.5 TB Samsung MZWLO3T8HCLS-00A07 NVMe®, Ubuntu® 22.04.4 LTS, 6.8.0-45-generic (tuned-adm profile throughput-performance, ulimit -l 198078840, ulimit -n 1024, ulimit -s 8192), BIOS RVOT1000C (SMT=off, Determinism=Power, Turbo Boost=Enabled), NPS=1;
2P AMD EPYC 9755 (256 Total Cores), 1.5TB 24x64GB DDR5-6400 (at 6000 MT/s), 1DPC, 1.0 Gbps NetXtreme BCM5720 Gigabit Ethernet PCIe, 3.5 TB Samsung MZWLO3T8HCLS-00A07 NVMe®, Ubuntu 22.04.4 LTS, 6.8.0-40-generic (tuned-adm profile throughput-performance, ulimit -l 198094956, ulimit -n 1024, ulimit -s 8192), BIOS RVOT0090F (SMT=off, Determinism=Power, Turbo Boost=Enabled), NPS=1;
2P AMD EPYC 9654 (192 Total cores), 1.5TB 24x64GB DDR5-4800, 1DPC, 2 x 1.92 TB Samsung MZQL21T9HCJR-00A07 NVMe®, Ubuntu 22.04.4 LTS, 6.8.0-40-generic (tuned-adm profile throughput-performance, ulimit -l 198120988, ulimit -n 1024, ulimit -s 8192), BIOS TTI100BA (SMT=off, Determinism=Power), NPS=1;
Versus 2P Xeon Platinum 8592+ (128 Total Cores), AMX On, 1TB 16x64GB DDR5-5600, 1DPC, 1.0 Gbps NetXtreme BCM5719 Gigabit Ethernet PCIe, 3.84 TB KIOXIA KCMYXRUG3T84 NVMe®, Ubuntu 22.04.4 LTS, 6.5.0-35-generic (tuned-adm profile throughput-performance, ulimit -l 132065548, ulimit -n 1024, ulimit -s 8192), BIOS ESE122V (SMT=off, Determinism=Power, Turbo Boost = Enabled).
Results:

| CPU | Run 1 | Run 2 | Run 3 | Median | Relative Throughput | Generational |
| --- | --- | --- | --- | --- | --- | --- |
| 2P Turin 192C, NPS1 | 1565.217 | 1537.367 | 1553.957 | 1553.957 | 3 | 2.41 |
| 2P Turin 128C, NPS1 | 1103.448 | 1138.34 | 1111.969 | 1111.969 | 2.147 | 1.725 |
| 2P Genoa 96C, NPS1 | 662.577 | 644.776 | 640.95 | 644.776 | 1.245 | 1 |
| 2P EMR 64C | 517.986 | 421.053 | 553.846 | 517.986 | 1 | NA |

Results may vary due to factors including system configurations, software versions and BIOS settings.
- 9xx5-011: FAISS (Requests/Hour) throughput results based on AMD internal testing as of 09/05/2024. FAISS Configurations: sift1m Data Set, 16 Core Instances, FP32, MKL 2024.2.1.
2P AMD EPYC 9965 (384 Total Cores), 24 16C instances, 1.5TB 24x64GB DDR5-6400 (at 6000 MT/s), 1DPC, 1.0 Gbps NetXtreme BCM5720 Gigabit Ethernet PCIe, 3.5 TB Samsung MZWLO3T8HCLS-00A07 NVMe®, Ubuntu® 22.04.4 LTS, 6.8.0-40-generic (tuned-adm profile throughput-performance, ulimit -l 198096812, ulimit -n 1024, ulimit -s 8192), BIOS RVOT1000C (SMT=off, Determinism=Power, Turbo Boost=Enabled), NPS=4;
2P AMD EPYC 9654 (192 Total cores), 12 16C instances, 1.5TB 24x64GB DDR5-4800, 1DPC, 2 x 1.92 TB Samsung MZQL21T9HCJR-00A07 NVMe, Ubuntu 22.04.3 LTS, BIOS 1006C (SMT=off, Determinism=Power), NPS=4;
Versus 2P Xeon Platinum 8592+ (128 Total Cores), 8 16C instances, AMX On, 1TB 16x64GB DDR5-5600, 1DPC, 1.0 Gbps NetXtreme BCM5719 Gigabit Ethernet PCIe, 3.84 TB KIOXIA KCMYXRUG3T84 NVMe, Ubuntu 22.04.4 LTS, 6.5.0-35-generic (tuned-adm profile throughput-performance, ulimit -l 132065548, ulimit -n 1024, ulimit -s 8192), BIOS ESE122V (SMT=off, Determinism=Power, Turbo Boost = Enabled).
Results:

| CPU | Median | Relative Throughput | Generational |
| --- | --- | --- | --- |
| 2P Turin 192C | 64.2 | 3.776 | 1.861 |
| 2P Genoa 96C | 34.5 | 2.029 | 1 |
| 2P EMR 64C | 17 | 1 | NA |

Results may vary due to factors including system configurations, software versions and BIOS settings.
- 9xx5-056: Llama3.1-70B inference throughput results based on AMD internal testing as of 09/24/2024. Llama3.1-70B configurations: vLLM 0.8.0, TP8 Parallel, FP8, Input/Output token configurations (use cases): [128/128, 128/2048, 2048/128, 2048/2048], continuous batching at 2000 prompts. Results in tokens/second.
2P AMD EPYC 9575F (128 Total Cores) with 8x AMD Instinct MI300X-NPS1-SPX-192GB-750W, GPU Interconnectivity XGMI, ROCm 6.2.0-66, 2304GB 24x96GB DDR5-6000, BIOS 1.0 (power determinism = off), Ubuntu 22.04.4 LTS, kernel 5.15.0-72-generic
2P Intel Xeon Platinum 8592+ (128 Total Cores) with 8x AMD Instinct MI300X-NPS1-SPX-192GB-750, GPU Interconnectivity XGMI, ROCm 6.2.0-66, 2048GB 32x64GB DDR5-4400, BIOS 2.0.4, (power determinism = off), Ubuntu 22.04.4 LTS, kernel 5.15.0-72-generic

| Input/Output Tokens | MI300X Turin | MI300X Emerald Rapids | Turin vs. EMR |
| --- | --- | --- | --- |
| 128/128 | 7739.32 | 7146.66 | 1.083 |
| 128/2048 | 9549.54 | 8536.45 | 1.119 |
| 2048/128 | 1399.82 | 1379.97 | 1.014 |
| 2048/2048 | 6330.81 | 5810.51 | 1.09 |

For average throughput increase of 1.076x. Results may vary due to factors including system configurations, software versions and BIOS settings.
- 9xx5-059A: Stable Diffusion XL v2 training results based on AMD internal testing as of 10/10/2024.
SDXL configurations: DeepSpeed 0.14.0, TP8 Parallel, FP8, batch size 24, results in seconds
2P AMD EPYC 9575F (128 Total Cores) with 8x AMD Instinct MI300X-NPS1-SPX-192GB-750W, GPU Interconnectivity XGMI, ROCm™ 6.2.0-66, 2304GB 24x96GB DDR5-6000, BIOS 1.0 (power determinism = off), Ubuntu® 22.04.4 LTS, kernel 5.15.0-72-generic, 334.80 seconds
2P Intel Xeon Platinum 8592+ (128 Total Cores) with 8x AMD Instinct MI300X-NPS1-SPX-192GB-750, GPU Interconnectivity XGMI, ROCm 6.2.0-66, 2048GB 32x64GB DDR5-4400, BIOS 2.0.4, (power determinism = off), Ubuntu 22.04.4 LTS, kernel 5.15.0-72-generic, 400.43 seconds
For 19.600% training performance increase.
Results may vary due to factors including system configurations, software versions and BIOS settings.
- 9xx5-005A: MySQL TPROC-C workload (SQL Server OLTP Brokerage) estimate based on internal AMD measurements as of 09/15/2024. The HammerDB TPROC-C workload is an open-source workload derived from TPC-Benchmark™ Standard, and as such is not comparable to published TPC-C™ results, as the results do not comply with the TPC-C Benchmark Standard. Workload configs: MySQL 8.0.39, 8 core nodes (Multi-SUT), HammerDB-4.4, duration 5min, 32 v users, warehouses 128, aggregate New Orders Per Minute (NOPM).
2P AMD EPYC 9965 powered server (384 total cores), 2.35TB Memory, BIOS RVC100DB, OS VMWare ESXi 8.0.3 build 70965425, 1x1.6TB and 10x3.84TB storage. VM Configurations: 8 cores/VM, 48 VMs, 48GB memory, Ubuntu 22.04.4 LTS, Linux 5.15.0-119-generic, BOOT_IMAGE=/vmlinuz-5.15.0-119-generic root=/dev/mapper/ubuntu--vg-ubuntu--lv ro
2P AMD EPYC 9755 powered server (256 total cores), 2.35TB Memory, BIOS RVOT1000C, OS VMWare ESXi 8.0.3 build 70965425, 1x1.6TB and 8x3.84TB storage. VM Configurations: 8 cores/VM, 32 VMs, 48GB memory, Ubuntu 22.04.4 LTS, Linux 5.15.0-119-generic, BOOT_IMAGE=/vmlinuz-5.15.0-119-generic root=/dev/mapper/ubuntu--vg-ubuntu--lv ro
2P AMD EPYC 9654 powered server (192 total cores), 1.5TB Memory, BIOS TVC100BD_2, OS VMWare ESXi 8.0.3 build 70965425, 1x1.6TB and 8x3.84TB storage. VM Configurations: 8 cores/VM, 24 VMs, 48GB memory, Ubuntu 22.04.4 LTS, Linux 5.15.0-119-generic, BOOT_IMAGE=/vmlinuz-5.15.0-119-generic root=/dev/mapper/ubuntu--vg-ubuntu--lv ro spec_rstack_overflow=off
2P Intel Xeon 8592+ powered server (128 total cores), 1TB Memory, BIOS ESE124B, OS VMWare ESXi 8.0.3 build 24022510, 1x1.6TB and 8x3.84TB storage. VM Configurations: 8 cores/VM, 16 VMs, 48GB memory, Ubuntu 22.04.4 LTS, Linux 5.15.0-119-generic, BOOT_IMAGE=/vmlinuz-5.15.0-119-generic root=/dev/mapper/ubuntu--vg-ubuntu--lv ro spec_rstack_overflow=off

| CPU | Score (TPM) | Relative_8592+ | Relative_9654 |
| --- | --- | --- | --- |
| Intel 8592+ (64c) | 9431248 | 1 | 0.523 |
| AMD EPYC 9654 (96c) | 18037794 | 1.913 | 1 |
| AMD EPYC 9755 (128c) | 32598005 | 3.456 | 1.807 |
| AMD EPYC 9965 (192c) | 36863796 | 3.909 | 2.043 |

Results may vary based on factors including but not limited to system configurations, software versions, and BIOS settings. TPC, TPC Benchmark, and TPC-C are trademarks of the Transaction Processing Performance Council.
- 9xx5-068: TPC Benchmark™ H @ 3000GB SF comparison based on published scores at tpc.org as of 10/10/2024. Configuration: 2P EPYC 9575F (3,401,383.1 QphH@3000GB, avail 10/10/2024, 128 total cores, www.tpc.org/3395) is 1.41x the QphH performance versus 2P AMD EPYC 9554 (2,405,162 QphH@3000GB, avail 10/01/2024, 128 total cores, www.tpc.org/3385). TPC, TPC Benchmark and TPC-H are trademarks of the Transaction Processing Performance Council.
- 9xx5-061: SPECpower_ssj® 2008 comparison based on published results from spec.org as of 10/10/2024.
2P EPYC 9965 (35275 overall ssj_ops/w, 2U), 384 total cores, https://spec.org/power_ssj2008/results/res2022q4/power_ssj2008-20240923-01441.html
2P EPYC 9654 (30602 overall ssj_ops/w, 2U), 192 total cores, https://spec.org/power_ssj2008/results/res2022q4/power_ssj2008-20221204-01204.html
Versus 2P Intel Xeon Platinum 8592+ (20408 35275 overall ssj_ops/w, 2U), 128 total cores, https://spec.org/power_ssj2008/results/res2024q2/power_ssj2008-20240422-01401.html
SPEC® and SPECpower_ssj® 2008 are registered trademarks of the Standard Performance Evaluation Corporation. See www.spec.org for more information.
- 9xx5-023: Source: https://www.amd.com/content/dam/amd/en/documents/epyc-technical-docs/performance-briefs/amd-epyc-9005-pb-namd.pdf
- 9xx5-035A: AMD testing as of 10/03/2024. The detailed results show the average uplift of the performance metric (Elapsed Time) of this benchmark for a 2P 64-Core AMD EPYC™ 9575F powered system compared to a 2P 64-Core Intel® Xeon® PLATINUM 8592+ powered system running select tests on Ansys LS-DYNA. Uplifts for the performance metric normalized to the 64-Core Intel® Xeon® PLATINUM 8592+ follow for each benchmark:
* Neon: ~1.68x
* Car2Car: ~1.72x
* 3 Cars: ~1.49x
* ODB 10m: ~1.63x

System Configurations
CPU: 2P 64-Core Intel® Xeon® PLATINUM 8592+ (128 total cores)
Memory: 16x 64 GB DDR5-5600
Storage: KIOXIA KCMYXRUG3T84
Platform and BIOS: ESE122V-3.10
BIOS Options: SMT=Off High Performance Mode
OS: rhel 9.4 5.14.0-427.16.1.el9_4.x86_64
Kernel Options: processor.max_cstate=1 intel_idle.max_cstate=0 iommu=pt mitigations=off
Runtime Options: cpupower frequency-set -g performance; echo 3 > /proc/sys/vm/drop_caches; echo 0 > /proc/sys/kernel/nmi_watchdog; echo 0 > /proc/sys/kernel/numa_balancing; echo 0 > /proc/sys/kernel/randomize_va_space; echo 'always' > /sys/kernel/mm/transparent_hugepage/enabled; echo 'always' > /sys/kernel/mm/transparent_hugepage/defrag

CPU: 2P 64-Core AMD EPYC™ 9575F (128 total cores)
Memory: 24x 64 GB DDR5-6000
Storage: SAMSUNG MZWLO3T8HCLS-00A07
Platform and BIOS: None RVOT1000C
BIOS Options: SMT=Off NPS=4 Power Determinism Mode
OS: rhel 9.4 5.14.0-427.16.1.el9_4.x86_64
Kernel Options: amd_iommu=on iommu=pt mitigations=off
Runtime Options: cpupower idle-set -d 2; cpupower frequency-set -g performance; echo 3 > /proc/sys/vm/drop_caches; echo 0 > /proc/sys/kernel/nmi_watchdog; echo 0 > /proc/sys/kernel/numa_balancing; echo 0 > /proc/sys/kernel/randomize_va_space; echo 'always' > /sys/kernel/mm/transparent_hugepage/enabled; echo 'always' > /sys/kernel/mm/transparent_hugepage/defrag

Results may vary based on system configurations, software versions, and BIOS settings. ANSYS, LS-DYNA and any and all ANSYS, Inc. brand, product, service and feature names, logos and slogans are registered trademarks or trademarks of ANSYS, Inc. or its subsidiaries in the United States or other countries. LS-DYNA is a registered trademark of Livermore Software Technology Corporation.
- 9xx5-007: V-Ray based on AMD internal testing as of 09/01/2024. System Configurations:
2P AMD EPYC™ 9965 reference system (2 x 192C), 1.5TB 24x64GB DDR5-6400 running at 6000MT/s, BIOS RVOT1000C (determinism enable=power), 476GB NVME, Ubuntu 22.04.4 LTS, Kernel Linux 6.8.0-40-generic, 329,847.67 average vsamples
2P AMD EPYC™ 9654 system (2 x 96C, 1.5TB 24x64GB DDR5-4800, BIOS TTI100BA (determinism enable=power), SAMSUNG MO003200KYDNC, Ubuntu 22.04.4 LTS, Kernel Linux 6.8.0-40-generic), 204,200.00 average vsamples
2P Intel Xeon Platinum 8592+ system (2 x 64C, 1TB 16x64GB DDR5-5600, BIOS ESE124B-3.11, 3.2 TB NVME, Ubuntu 22.04.3 LTS, Kernel Linux 6.5.0-35-generic), 144,452.67 average vsamples
For ~2.3x the performance when comparing the EPYC 9965 to Xeon Platinum 8592+ systems.
For 1.4x the performance when comparing the EPYC 9654 to Xeon Platinum 8592+ systems.
Chaos®, V-Ray® and Phoenix FD® are registered trademarks of Chaos Software EOOD in Bulgaria and/or other countries.
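
The relative figures quoted in several footnotes above appear to be plain ratios of the reported values. The following is a minimal Python sketch, not AMD's tooling, that recomputes three of them from numbers copied out of footnotes 9xx5-040A, 9xx5-056, and 9xx5-059A. The function names, dictionary keys, and rounding choices are illustrative assumptions. In particular, treating the 1.076x figure as the mean of the four per-configuration ratios is an assumption that happens to match the published value.

```python
from statistics import mean, median

# Run values (runs/hour) copied from footnote 9xx5-040A.
XGBOOST_RUNS = {
    "2P Turin 192C": [1565.217, 1537.367, 1553.957],
    "2P Turin 128C": [1103.448, 1138.34, 1111.969],
    "2P Genoa 96C": [662.577, 644.776, 640.95],
    "2P EMR 64C": [517.986, 421.053, 553.846],
}

# (Turin host, Emerald Rapids host) tokens/second, copied from footnote 9xx5-056.
LLAMA_70B_TOKENS_PER_S = {
    "128/128": (7739.32, 7146.66),
    "128/2048": (9549.54, 8536.45),
    "2048/128": (1399.82, 1379.97),
    "2048/2048": (6330.81, 5810.51),
}

# Training times in seconds, copied from footnote 9xx5-059A.
SDXL_SECONDS = {"EPYC 9575F": 334.80, "Xeon 8592+": 400.43}


def relative_medians(runs_by_system, baseline):
    """Divide each system's median run by the baseline system's median run."""
    medians = {name: median(runs) for name, runs in runs_by_system.items()}
    return {name: round(m / medians[baseline], 3) for name, m in medians.items()}


def mean_uplift(pairs):
    """Average the per-configuration ratios (assumed method, see lead-in)."""
    return round(mean(new / old for new, old in pairs.values()), 3)


def percent_faster(baseline_seconds, new_seconds):
    """Lower time is better, so the speedup is baseline time over new time."""
    return round((baseline_seconds / new_seconds - 1) * 100, 1)


vs_emr = relative_medians(XGBOOST_RUNS, "2P EMR 64C")
vs_genoa = relative_medians(XGBOOST_RUNS, "2P Genoa 96C")
llama_uplift = mean_uplift(LLAMA_70B_TOKENS_PER_S)
sdxl_gain = percent_faster(SDXL_SECONDS["Xeon 8592+"], SDXL_SECONDS["EPYC 9575F"])

print(vs_emr)        # "Relative Throughput" column of 9xx5-040A
print(vs_genoa)      # "Generational" column of 9xx5-040A
print(llama_uplift)  # 1.076
print(sdxl_gain)     # 19.6
```

Using the median rather than the mean for the XGBoost runs follows the footnote's own "Median" column, which keeps the low Emerald Rapids run (421.053) from dragging down the baseline.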