RAD Open-Source Projects

AMD contributes to open source to give developers tools for high-performance GPU and CPU computing, and invites collaboration to shape future systems.

ACCL

ACCL provides MPI-style collective communication for Xilinx FPGAs through a Vitis kernel and XRT drivers, enabling fast, scalable data movement.
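
To make "MPI-style collective communication" concrete, the sketch below simulates an all-reduce (sum) across several ranks in plain Python using the well-known ring algorithm (reduce-scatter followed by all-gather). This is not ACCL's API and does not claim to be the algorithm ACCL uses; the function name `ring_allreduce` and the one-element-per-chunk layout are assumptions made for illustration.

```python
def ring_allreduce(buffers):
    """Simulate a ring all-reduce (sum) over len(buffers) ranks.

    Hypothetical helper for illustration only: each rank holds a buffer
    of exactly n values (one value per chunk), where n is the rank count.
    Returns the per-rank buffers after the collective completes.
    """
    n = len(buffers)
    data = [list(b) for b in buffers]  # copy so the inputs stay untouched
    assert all(len(b) == n for b in data), "one chunk per rank expected"

    # Phase 1, reduce-scatter: in step s, rank r sends chunk (r - s) mod n
    # to its right neighbour, which adds it to its own copy of that chunk.
    for step in range(n - 1):
        # Snapshot all sends first so every rank "sends simultaneously".
        sends = [((r - step) % n, data[r][(r - step) % n]) for r in range(n)]
        for r, (chunk, value) in enumerate(sends):
            data[(r + 1) % n][chunk] += value

    # Now rank r holds the fully reduced chunk (r + 1) mod n.
    # Phase 2, all-gather: circulate the reduced chunks around the ring.
    for step in range(n - 1):
        sends = [((r + 1 - step) % n, data[r][(r + 1 - step) % n]) for r in range(n)]
        for r, (chunk, value) in enumerate(sends):
            data[(r + 1) % n][chunk] = value

    return data


if __name__ == "__main__":
    ranks = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
    for rank, buf in enumerate(ring_allreduce(ranks)):
        print(rank, buf)
```

Every rank ends with the element-wise sum of all input buffers, and each rank only ever exchanges data with its ring neighbour, which is why this pattern scales well on point-to-point links.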

Astra-Sim

Astra-Sim is a leading distributed ML system simulator, enhanced by AMD to more accurately model the collective communication algorithms generated by MSCCL++.

AUP AI Tutorials

A broad set of AMD AI notebooks spanning the full development cycle, organized into five areas: getting started, model design, specialization, optimization, and serving.

Brevitas

Brevitas is a PyTorch library for flexible neural network quantization, supporting both post-training quantization (PTQ) and quantization-aware training (QAT).

Chakra

Chakra is an open, portable benchmarking and co-design ecosystem using graph-based Execution Traces. AMD enhanced the toolkit and schema for compatibility with AMD Instinct MI-series GPUs.

FINN

FINN is a research framework for AI dataflow inference on FPGAs, using Brevitas for quantization and supporting CNNs, residual nets, and emerging transformer models.

gem5

gem5 is the world’s most widely used architecture simulator. AMD Research co-leads the project and continues to advance the only fully open-source model of AMD Instinct MI-series GPUs.

GeniePIM

AMD GeniePIM is an analytical model for GenAI workloads on processing-in-memory (PIM) hardware. It estimates GEMV performance on emerging PIM architectures and compares speedup, timing, and configurations against host GPUs.

Iris

Iris is a Triton-based framework for Remote Memory Access, developed by AMD RAD, providing SHMEM-like APIs in Triton to enable efficient multi-GPU programming.

IRON

IRON is an open-source, close-to-metal Python API for fast, efficient execution on AMD Ryzen™ AI NPUs, built on language bindings for the MLIR-AIE dialect.

LogicNets

LogicNets is a methodology for designing, training, and deploying sparse, quantized neural networks built from hardware-friendly building blocks for efficient inference.

NPUEval

NPUEval is an LLM evaluation dataset designed to target AIE kernel code generation on Ryzen™ AI hardware, enabling accurate benchmarking of NPU-focused models.

Omnistat

Omnistat offers utilities for aggregating scale-out system metrics through low-overhead sampling across entire clusters or subsets of hosts tied to a user’s job.

Omnitrace

Omnitrace is a comprehensive profiling and tracing tool for parallel C, C++, Fortran, HIP, OpenCL, and Python applications running on CPUs or hybrid CPU+GPU systems.

OpenNIC

The OpenNIC project offers an FPGA-based NIC platform for the open-source community, featuring a NIC shell along with Linux kernel and DPDK drivers.

P2P

P2P enables efficient data transfers between AMD GPUs and FPGAs over PCIe without using host memory, a capability now upstreamed into ETH Zürich’s Coyote runtime.

P4AI

P4AI is a framework for rapid prototyping of DNN-powered SmartNIC solutions, using automated code generation to build high-performance designs on AMD Alveo™ cards.

PACE

AMD PACE is a high-performance inference solution for LLMs on AMD platforms, offering a PyTorch extension for rapid integration of new kernels and graph optimizations.

PYNQ

PYNQ is an open-source Xilinx project that simplifies designing embedded systems on Zynq APSoCs, enabling rapid development using Python and flexible hardware overlays.

QONNX

QONNX extends ONNX with custom ops (IntQuant, FloatQuant, BipolarQuant, and Trunc) to represent arbitrary-precision integer and minifloat quantization.
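
As a rough illustration of what arbitrary-precision integer quantization means, the sketch below applies a uniform quantize-dequantize step to a single value: scale, shift by a zero point, round, clip to the integer range implied by the bit width, then map back to real values. It is plain Python, not the QONNX API; the names `int_range` and `quant_dequant` and their parameters are hypothetical, and rounding uses Python's built-in round-half-to-even as an assumption.

```python
def int_range(bitwidth, signed=True, narrow=False):
    """Return (min, max) of the integer grid for a given bit width.

    With narrow=True the range drops one value (for signed types this
    makes it symmetric, e.g. [-7, 7] instead of [-8, 7] at 4 bits).
    """
    if signed:
        lo = -(2 ** (bitwidth - 1)) + (1 if narrow else 0)
        hi = 2 ** (bitwidth - 1) - 1
    else:
        lo = 0
        hi = 2 ** bitwidth - 1 - (1 if narrow else 0)
    return lo, hi


def quant_dequant(x, scale, zero_point=0, bitwidth=8, signed=True, narrow=False):
    """Quantize x onto an integer grid, then map it back to a real value."""
    lo, hi = int_range(bitwidth, signed, narrow)
    q = round(x / scale + zero_point)  # built-in round: half-to-even
    q = max(lo, min(hi, q))            # saturate at the grid limits
    return (q - zero_point) * scale


if __name__ == "__main__":
    # 4-bit signed grid with step 0.5 covers [-4.0, 3.5].
    for value in (1.3, 100.0, -100.0):
        print(value, "->", quant_dequant(value, scale=0.5, bitwidth=4))
```

Because the bit width is just a parameter, the same function covers 2-bit, 4-bit, or 8-bit grids, which is the kind of flexibility a fixed set of standard integer types cannot express.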

RapidWright

RapidWright is an open-source framework that enables customized, domain-specific FPGA implementation flows, giving developers fine-grained control over design mapping.

RecoNIC

RecoNIC is an RDMA-enabled SmartNIC with compute acceleration, reducing data-copy overhead and moving data closer to computation for faster, more efficient processing.

rocSHMEM

rocSHMEM began as an AMD Research effort to deliver GPU-centric networking via an OpenSHMEM-like interface and is now a full production library in the ROCm platform.

Ryzers

Ryzers offers composable Dockerfiles and build scripts for deploying software, full applications, and demonstrators on AMD Ryzen™ AI hardware.

TensorCast

TensorCast is a PyTorch-based casting and quantization library focused on OCP MX and AMD-relevant low-precision datatypes, providing tools and reference code for verification.