AOCL-DLP (Deep Learning Primitives) is a high-performance library that provides optimized deep learning primitives for AMD processors. The library implements Low Precision GEMM (LPGEMM) operations for deep learning applications with support for multiple data types, post-operations, and quantization techniques. Select kernels have been optimized for AMD EPYC™ processors, leveraging AVX2, AVX512, AVX512_VNNI, and AVX512_BF16 instruction sets.
AOCL-DLP provides APIs for GEMM operations with various precision formats, comprehensive post-operations for fused computations, batch GEMM support, symmetric quantization routines, and parallel execution via OpenMP.
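To make the idea of symmetric quantization mentioned above concrete, here is a minimal pure-Python sketch of the arithmetic involved. It does not use AOCL-DLP and does not reflect its API; the function names `quantize_symmetric` and `dequantize_symmetric` are hypothetical, and the per-tensor scale derived from the maximum absolute value is one common convention assumed here for illustration.

```python
def quantize_symmetric(values, num_bits=8):
    """Map floats to signed integers with one scale and a zero-point of 0.

    Illustrative only: assumes a single per-tensor scale chosen from the
    largest absolute value, which is one common symmetric scheme.
    """
    qmax = 2 ** (num_bits - 1) - 1  # 127 for INT8
    max_abs = max((abs(v) for v in values), default=0.0)
    # Avoid dividing by zero when every input is 0 (or the list is empty).
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Python's round() rounds exact halves to the nearest even integer.
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_symmetric(quantized, scale):
    """Recover approximate float values from the quantized integers."""
    return [q * scale for q in quantized]
```

Because the zero-point is fixed at 0, a value of 0.0 always maps to the integer 0, and the round-trip error for any in-range value is at most half of the scale.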
Highlights of AOCL-DLP 5.2
- Supports GEMM and BatchGEMM APIs for F32, BF16, and INT8 data types
- Supports symmetric quantized INT8 APIs
- All APIs support fused elementwise post-operations such as add-bias, ReLU, GeLU (both ERF and tanh variants), Sigmoid, and elementwise matrix addition and multiplication
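The fused post-operations listed above can be pictured with a small reference computation. The sketch below is plain Python, not AOCL-DLP code, and none of the names (`gemm_bias_act`, `relu`, `gelu_erf`, `gelu_tanh`) come from the library. It assumes the bias is a vector with one entry per output column, and it applies the activation to each accumulated element before storing it, which is the general idea behind fusing a post-operation into a GEMM.

```python
import math


def relu(x):
    return max(0.0, x)


def gelu_erf(x):
    # GeLU defined through the Gaussian error function.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x):
    # Widely used tanh-based approximation of GeLU.
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


def gemm_bias_act(a, b, bias, activation):
    """Compute activation(A @ B + bias) one output element at a time.

    a: M x K list of rows, b: K x N list of rows,
    bias: length-N list (assumed to be per output column).
    """
    m, k, n = len(a), len(b), len(b[0])
    out = []
    for i in range(m):
        row = []
        for j in range(n):
            acc = sum(a[i][p] * b[p][j] for p in range(k))
            # The post-operations run on the accumulator before the store,
            # so no separate pass over the output matrix is needed.
            row.append(activation(acc + bias[j]))
        out.append(row)
    return out
```

For example, `gemm_bias_act([[1.0, 2.0], [3.0, 4.0]], [[1.0, 0.0], [0.0, 1.0]], [-2.0, 0.5], relu)` adds the bias to each column of the product and clamps the single negative entry to zero.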
The binary packages in the Downloads section contain the AOCL-DLP library binaries, which include optimizations for AMD processors, along with examples and documentation.
Documentation
- AOCL-DLP API Guide
- Source code: GitHub
- Wiki: GitHub Wiki
Downloads
| File Name | Version | Size | Launch Date | OS | Bitness | Description |
| --- | --- | --- | --- | --- | --- | --- |
| Binary packages compiled with AOCC 5.1 | | | | | | |
| aocl-dlp-linux-aocc-5.2.0.tar.gz | 5.2 | 10 MB | 12/31/2025 | RHEL, Ubuntu, SLES | 64-bit | AOCC-compiled AOCL-DLP library binary package. SHA-256 checksum: f734147fc65518cae199d431cdf435d9545d572acfff0795511830fa9e122f51 |
| Binary packages compiled with GCC 14.2.1 | | | | | | |
| aocl-dlp-linux-gcc-5.2.0.tar.gz | 5.2 | 10 MB | 12/31/2025 | RHEL, Ubuntu, SLES | 64-bit | GCC-compiled AOCL-DLP library binary package. SHA-256 checksum: b5adc2a27422502e06e9c830b6b369d5be5f745f08e0a9472c340fb67ff5f264 |
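A downloaded package can be checked against the SHA-256 checksum published in the table before it is unpacked. The following is a minimal sketch using only the Python standard library; the helper names `sha256_of_file` and `matches_checksum` are hypothetical and are not part of AOCL-DLP or its packaging.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare a file's digest with a published checksum, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Assuming the AOCC package was saved in the current directory, `matches_checksum("aocl-dlp-linux-aocc-5.2.0.tar.gz", "f734147fc65518cae199d431cdf435d9545d572acfff0795511830fa9e122f51")` should return `True` for an intact download.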