AMD FirePro™ S9300 x2 Server GPU

The World’s First GPU Accelerator with 1TB/s Memory Bandwidth

Accelerate your most complex HPC workloads in data analytics or seismic processing on the world’s fastest single-precision compute GPU accelerator, the AMD FirePro™ S9300 x2 Server GPU.2,4

Overview

The new AMD FirePro™ S9300 x2 Server GPU is the world’s first professional GPU accelerator equipped with high bandwidth memory (HBM) and the first accelerator compatible with all of AMD’s GPUOpen Professional Compute tools and libraries. HBM allows the AMD FirePro S9300 x2 Server GPU to exceed the competition with 3.5x the memory bandwidth of NVIDIA’s Tesla M40 and 2.1x the memory bandwidth of NVIDIA’s Tesla K803.

Based on the third-generation AMD Graphics Core Next (GCN) architecture, the AMD FirePro S9300 x2 Server GPU delivers up to 13.9 TFLOPS of peak single-precision floating point performance – more than any other GPU accelerator available on the market today for single-precision compute4. Compared to Intel’s flagship Xeon E5 CPU, the raw performance advantage of the FirePro™ S9300 x2 GPU is even more dramatic – over 15x the memory bandwidth and over 12x the peak single-precision performance6.

A great accelerator is not complete without a great developer ecosystem. With AMD’s GPUOpen Professional Compute software stack, the AMD FirePro S9300 x2 Server GPU uses AMD’s first open-source Linux® driver built specifically for compute, and supports acceleration using C++ in addition to OpenCL™. For those with existing CUDA code, the majority of it can easily be ported over to C++, giving companies the freedom to choose between vendors.

Benefits

  • The AMD FirePro™ S9300 x2 Server GPU offers the highest single-precision floating point performance of any GPU accelerator4.
  • The AMD FirePro™ S9300 x2 Server GPU is the world’s first and only professional GPU equipped with high bandwidth memory (HBM)1.
  • The AMD FirePro™ S9300 x2 Server GPU supports AMD’s GPUOpen software stack, allowing developers to code and compile in C++ or OpenCL™.

Features

Radeon Open Compute Platform (ROCm)

ROCm comprises an open-source Linux® driver optimized for compute, support for GPU acceleration using a new compiler that processes code written in the C++ programming language, and other developer tools such as the Heterogeneous-compute Interface for Portability (HIP) tool for porting code written for CUDA to C++.

ROCm is built for scale; it supports multi-GPU peer-to-peer computing including communication through RDMA.

ROCm has a rich system runtime with the critical features that large-scale application, compiler and language-runtime development requires.
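
As a rough illustration of the multi-GPU peer-to-peer support mentioned above, the sketch below uses HIP’s peer access calls to copy a buffer directly between two GPUs in the same system. The device indices, buffer size and minimal error handling are illustrative assumptions, not a tuned or validated configuration.

```cpp
#include <hip/hip_runtime.h>
#include <cstdio>

int main() {
    const size_t bytes = 64 << 20;          // 64 MiB test buffer (arbitrary size)
    int canAccess = 0;

    // Check whether GPU 0 can address GPU 1's memory directly.
    hipDeviceCanAccessPeer(&canAccess, 0, 1);
    if (!canAccess) { printf("Peer access not available\n"); return 1; }

    void *src = nullptr, *dst = nullptr;
    hipSetDevice(0);
    hipMalloc(&src, bytes);                 // buffer on GPU 0
    hipDeviceEnablePeerAccess(1, 0);        // allow GPU 0 to access GPU 1

    hipSetDevice(1);
    hipMalloc(&dst, bytes);                 // buffer on GPU 1

    // Copy GPU 0 -> GPU 1 without staging through host memory.
    hipMemcpyPeer(dst, 1, src, 0, bytes);
    hipDeviceSynchronize();

    hipFree(src);
    hipFree(dst);
    return 0;
}
```

When peer access is available, the copy bypasses host staging buffers; RDMA extends the same idea to transfers that leave the node.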

HSA Foundation 

HSA-compliant Runtime and Driver for AMD Radeon™ and FirePro™ GPUs

Heterogeneous-compute Interface for Portability (HIP) Tool

Easily convert your code to C++ with this free, open-source tool while maintaining compatibility with CUDA compilers. The HIP tool allows developers to port the majority of their CUDA code over to C++ in a snap. Get started today on the AMD FirePro S9300 x2 GPU, an open-source friendly accelerator from AMD.
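
As a minimal sketch of what ported code looks like, the example below shows a SAXPY kernel written against the HIP runtime: the kernel body is unchanged from its CUDA form, while the memory-management and launch calls use their HIP equivalents. The problem size and launch geometry are arbitrary illustrative choices.

```cpp
#include <hip/hip_runtime.h>
#include <vector>

// The kernel body is identical to what a CUDA version would contain;
// only the host-side runtime calls change when porting.
__global__ void saxpy(int n, float a, const float* x, float* y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main() {
    const int n = 1 << 20;
    std::vector<float> hx(n, 1.0f), hy(n, 2.0f);

    float *dx, *dy;
    hipMalloc(&dx, n * sizeof(float));               // cudaMalloc -> hipMalloc
    hipMalloc(&dy, n * sizeof(float));
    hipMemcpy(dx, hx.data(), n * sizeof(float), hipMemcpyHostToDevice);
    hipMemcpy(dy, hy.data(), n * sizeof(float), hipMemcpyHostToDevice);

    // CUDA's <<<...>>> launch syntax maps to hipLaunchKernelGGL.
    hipLaunchKernelGGL(saxpy, dim3((n + 255) / 256), dim3(256), 0, 0,
                       n, 2.0f, dx, dy);

    hipMemcpy(hy.data(), dy, n * sizeof(float), hipMemcpyDeviceToHost);
    hipFree(dx);
    hipFree(dy);
    return 0;
}
```

The HIP tool automates this kind of mechanical substitution (cudaMalloc to hipMalloc, launch syntax to hipLaunchKernelGGL) across the bulk of typical CUDA code.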

OpenCL™ 1.2 Support

Helps professionals tap into the parallel computing power of modern GPUs and multicore CPUs to accelerate compute-intensive tasks in leading CAD/CAM/CAE and Media & Entertainment applications that support OpenCL. The AMD FirePro S9300 x2 Server GPU supports OpenCL™ 1.2, allowing developers to take advantage of new features that give GPUs more freedom to do the work they are designed to do. 
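
As a starting point, the minimal host-side sketch below enumerates an OpenCL™ 1.2 GPU device and creates the context and command queue that any OpenCL workload begins with. It uses only standard OpenCL 1.2 API calls; the single-platform, single-device assumption and the omitted error checks are simplifications for brevity.

```cpp
#define CL_TARGET_OPENCL_VERSION 120   // target the OpenCL 1.2 API
#include <CL/cl.h>
#include <cstdio>

int main() {
    cl_platform_id platform;
    cl_device_id device;
    clGetPlatformIDs(1, &platform, nullptr);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, nullptr);

    // Report which device was found and how much global memory it exposes.
    char name[256];
    cl_ulong globalMem;
    clGetDeviceInfo(device, CL_DEVICE_NAME, sizeof(name), name, nullptr);
    clGetDeviceInfo(device, CL_DEVICE_GLOBAL_MEM_SIZE, sizeof(globalMem), &globalMem, nullptr);
    printf("Device: %s, global memory: %llu MB\n", name,
           (unsigned long long)(globalMem >> 20));

    // A context and command queue are the starting point for any OpenCL 1.2 workload.
    cl_context ctx = clCreateContext(nullptr, 1, &device, nullptr, nullptr, nullptr);
    cl_command_queue queue = clCreateCommandQueue(ctx, device, 0, nullptr);

    clReleaseCommandQueue(queue);
    clReleaseContext(ctx);
    return 0;
}
```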

13.9 TFLOPS of Peak Single Precision

Helps reduce the time required to complete single precision floating point operations used within Simulation, Video Enhancement, Signal Processing, Video Transcoding and Digital Rendering applications where high performance takes precedence over accuracy. With the AMD FirePro™ S9300 x2 delivering 13.9 TFLOPS of peak single precision compute performance, one can configure a 2P server with 8 GPUs to achieve over 111 TFLOPS of peak single precision compute performance. In a standard 42U rack with 10x 4U servers, that’s potentially over 1 PFLOP of single precision compute performance!
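
The arithmetic behind those figures is straightforward; the small sketch below simply multiplies out the per-board number quoted above for the example 8-GPU server and 10-server rack. The configuration sizes come from the text, and actual achievable throughput will of course depend on the workload.

```cpp
#include <cstdio>

int main() {
    // Peak single precision figure quoted above, per S9300 x2 board.
    const double tflopsPerBoard = 13.9;
    const int boardsPerServer   = 8;     // the "2P server with 8 GPUs" example from the text
    const int serversPerRack    = 10;    // 10x 4U servers in a standard 42U rack

    double perServer = tflopsPerBoard * boardsPerServer;       // ~111.2 TFLOPS
    double perRack   = perServer * serversPerRack / 1000.0;    // ~1.11 PFLOPS

    printf("Per server: %.1f TFLOPS, per rack: %.2f PFLOPS\n", perServer, perRack);
    return 0;
}
```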

870 GFLOPS of Peak Double Precision

Helps reduce the time required to complete double precision floating point operations used within Computational Fluid Dynamics, Structural Mechanics, Reservoir Simulation and Aerodynamics applications, where numerical precision is mission critical.

Half Precision (FP16) Support

Developers who do not need the accuracy of 32-bit mathematical operations can now use 16-bit operations to help achieve higher performance through more efficient use of memory bandwidth and a reduced memory footprint.
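
As a small illustration of how FP16 storage saves bandwidth, the OpenCL C kernel below keeps its buffers in 16-bit floats while doing the arithmetic in 32-bit registers; vload_half and vstore_half are core OpenCL 1.2 built-ins. The kernel itself (a simple scale operation) and its name are purely illustrative.

```cpp
// OpenCL C kernel source, embedded as a string in the C++ host program.
// Inputs and outputs are stored as 16-bit floats to halve memory traffic,
// while the arithmetic itself is done in 32-bit registers.
const char* kernelSrc = R"(
__kernel void scale_fp16(__global const half* in,
                         __global half* out,
                         float factor)
{
    size_t i = get_global_id(0);
    float v = vload_half(i, in);       // read a 16-bit value, convert to float
    vstore_half(v * factor, i, out);   // convert back to 16 bits on store
}
)";
```

Halving the size of each element halves the bytes moved per operation, which is where the bandwidth and footprint savings come from.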

8GB HBM Memory

HBM is a new type of memory design with low power consumption and ultra-wide communication lanes. It uses vertically stacked memory chips interconnected by microscopic wires called “through-silicon vias,” or TSVs, placed directly onto the interposer, shortening the distance information has to travel between memory and processor.

AMD PowerTune

AMD PowerTune Technology is an intelligent power management system that monitors both GPU activity and power draw. AMD PowerTune optimizes the GPU to deliver low power draw when GPU workloads do not demand full activity and delivers the optimal clock speed to ensure the highest possible performance within the GPU’s power budget for high intensity workloads.5

Specs

Cooling/Power/Form Factor

  • Max Power: 300W
  • Bus Interface: PCIe® Gen 3 x16
  • Form Factor: Dual Slot, Full Length, Full Height
  • Cooling: Passive

Memory

  • Size/Type: 8GB HBM
  • Bandwidth: 1TB/s (2x 512GB/s)

API and OS Support

  • OpenCL™ 1.2
  • HC (Heterogeneous Compute)
  • C++ AMP
  • Linux® 64-bit

Enabled AMD Technologies

  • AMD PowerTune technology5

System Requirements

  • PCI Express® based server with one available x16 lane slot. AMD recommends PCI Express® v3.0 for optimal performance
  • Power supply with two PCIe 8-pin aux power connectors
  • Airflow through GPU of at least 25 CFM, max inlet temperature 45°C
  • Minimum 16GB DDR3/DDR4 system memory recommended

Warranty and Support

  • Three-year limited product repair/replacement warranty
  • Direct toll-free phone and email access to dedicated workstation technical support team7
  • Advanced parts replacement option

Resources

GPUOpen Professional Compute

GPUOpen Professional Compute is designed to empower all types of developers to accelerate the implementation of their vision and help solve their biggest challenges in instinctive and high-performance GPU computing through optimized open-source driver/runtimes and standards-based languages, libraries and applications.

Learn More

Customer Spotlight: CGG

CGG is a leader in cutting-edge geoscience. CGG has achieved leadership through a strong focus on innovation and a commitment to delivering the best sustainable solutions to their clients' energy challenges. They bring their clients a unique range of technologies, services and equipment designed to acquire extremely precise data and images of the Earth's subsurface. CGG also provides state-of-the-art software and services for analyzing that data and developing a deeper understanding of the subsurface for exploration, production and optimization of oil and gas reservoirs.

CGG recently conducted proprietary wave equation modeling benchmarks on several different GPU accelerators, including the new AMD FirePro™ S9300 x2 GPU. As the complexity of the wave equation increased, the performance advantage grew in favor of the AMD FirePro™ S9300 x2, to the point where it was 2x faster than any other card tested8.

Chart Provided by CGG

“We’re very pleased with the AMD FirePro™ compute clusters,” said Jean-Yves Blanc, Chief IT Architect, CGG. “We’re also impressed by the 1TB/s memory bandwidth of the AMD FirePro S9300 x2, a board which delivers over 2x the performance of any other server GPU board on CGG Wave Equation Modeling codes.”

CGG’s Marc Tchiboukdjian recently gave a presentation at Rice University on the GPUWrapper, a portable API for heterogeneous computing. Watch Marc’s presentation by clicking on the link below.

Watch Now

CGG installed its first rack of oil immersion cooled compute systems in June 2011. Over time, CGG has learned several lessons from implementing this solution, covering actual cost savings (CapEx, OpEx), equipment failure rates, thermal performance, and operational issues. Find out more about CGG’s oil immersion cooling at their seismic processing data center via the video link below.

Watch Now

Drivers

Footnotes