Empowering a new era of scale-out compute for HPC and Deep Learning
Truly accelerating the pace of deep learning and addressing the broad needs of the datacenter requires a combination of high-performance compute and GPU acceleration, optimized for handling massive amounts of data with heavy floating-point computation that can be spread across many cores. Large system designers today also need the ability to design efficient systems, with the flexibility and openness to configure them to meet the challenge of today’s very demanding workloads.
AMD is empowering designers with those capabilities, allowing them to raise the bar on achievable compute densities by enabling optimized server designs with higher performance, reduced latencies and improved efficiencies in an open, flexible environment. With the introduction of new EPYC processor-based servers with Radeon Instinct GPU accelerators, combined with our ROCm open software platform, AMD is ushering in a new era of heterogeneous compute for HPC and Deep Learning.
Radeon Instinct™ MI25 Server Accelerators
AMD is changing the game with the introduction of its open standards-based Radeon Instinct family of products. Radeon Instinct accelerators, combined with our open ecosystem approach to heterogeneous compute, raise the bar on achievable performance, efficiencies and the flexibility needed to design systems capable of meeting the challenges of today’s data-centric workloads.
The new Radeon Instinct MI25 accelerator, based on AMD’s Next-Gen “Vega” architecture with its powerful parallel compute engine, is the world’s ultimate training accelerator for large-scale deep learning applications. It is also a workhorse for HPC workloads, delivering 24.6 TFLOPS of FP16 and 12.3 TFLOPS of FP32 peak floating-point performance.1 Combine this power with the open ROCm software platform and the world’s most advanced GPU memory architecture, with 16GB of HBM2 and up to 484 GB/s of memory bandwidth, and you get the ultimate solution for today’s compute workloads.
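As a rough illustration of where peak figures like these come from, the short Python sketch below recomputes them from first principles. The hardware parameters it uses (4096 stream processors, a 1.5 GHz peak engine clock, a 2048-bit HBM2 interface and a 945 MHz memory clock) are assumptions that do not appear in this text, and the helper names are hypothetical. Treat it as a back-of-the-envelope check, not an official derivation.

```python
# Back-of-the-envelope check of the peak figures quoted above.
# All four hardware parameters are ASSUMPTIONS, not values stated in this text.
STREAM_PROCESSORS = 4096        # assumed number of FP32 lanes
PEAK_ENGINE_CLOCK_GHZ = 1.5     # assumed peak engine clock
MEMORY_BUS_WIDTH_BITS = 2048    # assumed HBM2 interface width
MEMORY_CLOCK_GHZ = 0.945        # assumed memory clock


def peak_tflops(lanes, clock_ghz, flops_per_lane_per_clock):
    """Peak throughput in TFLOPS: lanes x clock (GHz) x flops per lane per clock."""
    return lanes * clock_ghz * flops_per_lane_per_clock / 1000.0


def peak_bandwidth_gbs(bus_width_bits, clock_ghz, transfers_per_clock=2):
    """Peak bandwidth in GB/s: bytes per transfer x clock (GHz) x transfers per clock."""
    return bus_width_bits / 8 * clock_ghz * transfers_per_clock


# One fused multiply-add counts as 2 floating-point operations.
fp32_tflops = peak_tflops(STREAM_PROCESSORS, PEAK_ENGINE_CLOCK_GHZ, 2)
# Packed FP16 is assumed to process two values per lane, hence 4 operations.
fp16_tflops = peak_tflops(STREAM_PROCESSORS, PEAK_ENGINE_CLOCK_GHZ, 4)
bandwidth_gbs = peak_bandwidth_gbs(MEMORY_BUS_WIDTH_BITS, MEMORY_CLOCK_GHZ)

print(f"FP32 peak: {fp32_tflops:.1f} TFLOPS")
print(f"FP16 peak: {fp16_tflops:.1f} TFLOPS")
print(f"Memory bandwidth: {bandwidth_gbs:.0f} GB/s")
```

Under these assumptions the results round to the FP32, FP16 and bandwidth figures quoted above. They are theoretical peaks, so sustained application performance will be lower.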
Radeon Instinct MI25 Highlights:
Built on AMD’s Next-Gen “Vega” architecture with the world’s most advanced GPU memory architecture
Superior FP16 and FP32 performance for HPC and Deep Learning
ROCm open software platform for HPC-class rack scale
Large BAR support for multi-GPU (mGPU) peer-to-peer communication
MxGPU SR-IOV hardware virtualization technologies for optimized datacenter utilization
Superior compute density and performance per node when combining new AMD EPYC™ processor-based servers and Radeon Instinct MI25 accelerators
ROCm Open Software Platform
The ROCm open software platform delivers an open-source foundation for HPC-class heterogeneous compute and world-class datacenter system designs. The ROCm platform provides performance-optimized Linux® drivers, compilers, tools and libraries. ROCm’s software design philosophy offers programming choice, minimalism and a modular software development approach to allow for more optimized GPU accelerator computing.
Combine this approach with AMD’s secure hardware-virtualized MxGPU technologies, and system designers can change how they design systems to achieve higher efficiencies and drive optimized datacenter utilization and capacities.
ROCm foundational elements:
Open Headless Linux® 64-bit driver and rich system runtime stack optimized for Hyperscale & HPC-class compute
Multi-GPU compute supporting communication within and across server nodes through RDMA, with direct RDMA peer-sync support in the driver
Simpler programming model giving developers control when needed
HCC true single-source C++ heterogeneous compiler addressing the whole system, not just a single device
HIP CUDA conversion tool providing platform choice for GPU computing APIs
The ROCm open software platform provides a solid foundation for large-scale Machine Intelligence and HPC datacenter deployments, with an optimized open Linux driver and a rich ROCr System Runtime that is language independent and makes heavy use of the Heterogeneous System Architecture (HSA) Runtime API. This provides a rich foundation for executing programming languages and tools such as HCC C++, Khronos Group’s OpenCL™, Continuum’s Anaconda Python and the HIP CUDA conversion tool.2
AMD continues to embrace an open approach to extending support for the critical features required for NUMA-class acceleration to our Radeon™ GPU accelerators for HPC and deep learning deployments. The ROCm platform now supports our new Radeon Instinct GPU accelerator family of products and continues to support a number of our other AMD FirePro™ S Series, Radeon™ RX Series, and Radeon™ Pro Duo graphics cards. Please visit the ROCm web site for a full list of supported GPU cards.
OpenCL™, OpenMP and OpenACC Support
AMD continues to support these standards on our product offerings3. We believe that most people in the HPC community want open standards as the default way of running their projects and simulations. AMD is committed to supporting this goal and is working extensively with the community to drive open standards forward.