AOCL-BLAS
AOCL-BLAS is a high-performance implementation of the Basic Linear Algebra Subprograms (BLAS), the essential kernels of matrix and vector computation and among the most heavily used compute-intensive operations in dense numerical linear algebra. Select kernels are optimized for AMD “Zen”-based processors, including AMD EPYC™, AMD Ryzen™, and AMD Ryzen™ Threadripper™ processors.
AOCL-BLAS is a fork of BLIS (https://github.com/flame/blis), which was originally developed by members of the Science of High-Performance Computing (SHPC) group in the Institute for Computational Engineering and Sciences at The University of Texas at Austin and other collaborators (including AMD). The library retains all known BLIS features and adds:
- Standard BLAS and CBLAS interfaces
- C++ template interfaces for BLAS functionalities
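
To make the kind of kernel behind these interfaces concrete, the following is a minimal pure-Python sketch of what a GEMM call computes, C := alpha*A*B + beta*C, using the column-major layout and leading-dimension arguments of the BLAS convention. The name `gemm_reference` and its argument order are illustrative assumptions, not AOCL-BLAS code or its API, and the sketch ignores transposition options and performance entirely.

```python
def gemm_reference(m, n, k, alpha, a, lda, b, ldb, beta, c, ldc):
    """Hypothetical reference GEMM: C := alpha*A*B + beta*C, updated in place.

    A is m x k, B is k x n, C is m x n. Each matrix is a flat list in
    column-major order, so element (i, j) of a matrix with leading
    dimension ld is stored at index i + j*ld.
    """
    for j in range(n):            # columns of C
        for i in range(m):        # rows of C
            acc = 0.0
            for p in range(k):    # inner (dot-product) dimension
                acc += a[i + p * lda] * b[p + j * ldb]
            c[i + j * ldc] = alpha * acc + beta * c[i + j * ldc]
    return c


# A = [[1, 2], [3, 4]] and B = [[5, 6], [7, 8]], stored column by column.
a = [1.0, 3.0, 2.0, 4.0]
b = [5.0, 7.0, 6.0, 8.0]
c = [0.0, 0.0, 0.0, 0.0]
gemm_reference(2, 2, 2, 1.0, a, 2, b, 2, 0.0, c, 2)
print(c)  # [19.0, 43.0, 22.0, 50.0], i.e. [[19, 22], [43, 50]] column by column
```

An optimized library produces the same result for such a call but blocks the loops for cache and vector units, which is where the processor-specific tuning applies.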
Highlights of AOCL-BLAS 5.3
- Performance improvements in S/D/ZGEMM on Zen3/4/5
- SGEMM optimizations for tiny matrices
- New Thread Control APIs with global and thread-local variants
- Support for OpenMP 2.5 and earlier versions
- Optional support for reproducibility using compiler options
- Updates to aocl-gemm add-on module:
  - Column-major support for BF16 and FP32
  - FP32 RD kernels for AVX512 and AVX2 ISA
  - GEMV kernel for m=1 case using AVX2 and AVX512 YMM registers

You can find the package containing AOCL-BLAS Library binaries that includes optimizations for AMD processors, examples, and documentation in the Downloads section.
Documentation
- Source code: GitHub.
AOCL-LAPACK
AOCL-LAPACK is a high-performance implementation of the Linear Algebra PACKage (LAPACK), which provides routines for solving systems of linear equations, least-squares problems, eigenvalue problems, singular value problems, and the associated matrix factorizations. Extensible, easy to use, and available under an open-source license, AOCL-LAPACK can be used by applications that rely on standard Netlib LAPACK interfaces with virtually no changes to their source code. AOCL-LAPACK supports C, Fortran, and C++ template interfaces (for a subset of APIs) for the LAPACK APIs.
AOCL-LAPACK is compatible with the LAPACK 3.12.0 specification. Combined with the AOCL-BLAS library, which includes optimizations for AMD “Zen”-based processors, AOCL-LAPACK delivers high-performance LAPACK functionality on AMD platforms.
Highlights of AOCL-LAPACK 5.3
- Improved performance of the following routines:
  - QR factorization, Singular Value Decomposition (DGELSS, DORGQR, SGESDD)
  - Matrix inverse routine DPOTRI for medium sizes
- Usability improvements:
  - All internal code logic updated to use 64-bit integers to extend the range of matrix sizes supported
- Test suite enhancements:
  - Extended BRT test coverage to remaining APIs
  - Introduced separate functional and performance test modes
  - Added API-specific YAML-based ctests to improve test coverage
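
The 64-bit integer item above can be illustrated with a little arithmetic. The sketch below assumes that the limiting quantity is a flat element count or column-major offset held in a signed integer; that is an assumption made for illustration, not a description of AOCL-LAPACK internals, and the helper names are hypothetical.

```python
import math

INT32_MAX = 2**31 - 1   # largest signed 32-bit integer
INT64_MAX = 2**63 - 1   # largest signed 64-bit integer


def max_square_order(int_max):
    """Largest n such that an n x n matrix has at most int_max elements."""
    return math.isqrt(int_max)


def offset_fits(i, j, ld, int_max):
    """Check whether the column-major offset i + j*ld is representable."""
    return i + j * ld <= int_max


n32 = max_square_order(INT32_MAX)
n64 = max_square_order(INT64_MAX)
print(n32)  # 46340
print(n64)  # 3037000499

# Offset of the last element of a 50000 x 50000 matrix.
n = 50_000
print(offset_fits(n - 1, n - 1, n, INT32_MAX))  # False
print(offset_fits(n - 1, n - 1, n, INT64_MAX))  # True
```

Under this assumption, a square matrix of order above 46340 already has more elements than a signed 32-bit integer can count, which is the kind of limit that wider internal integers remove.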
Documentation
- AOCL-LAPACK API Guide
- Prior versions: AOCL-BLAS and AOCL-LAPACK Library Archive
- Source code: GitHub
Downloads
| File Name | Version | Size | Launch Date | OS | Bitness | Description |
| --- | --- | --- | --- | --- | --- | --- |
| Binary packages compiled with AOCC 5.2 | ||||||
| aocl-blis-linux-aocc-5.3.0.tar.gz | 5.3 | 21MB | 05/18/2026 | RHEL, Ubuntu, SLES | 64-bit | AOCC compiled AOCL-BLAS library binary package SHA-256 checksum: ba72662b7606f2dc4ebbed2821a89cc2bc1ca0d72e10b4c5aa87a6e4e5b460ee |
| aocl-libflame-linux-aocc-5.3.0.tar.gz | 5.3 | 31MB | 05/18/2026 | RHEL, Ubuntu, SLES | 64-bit | AOCC compiled AOCL-LAPACK Library binary package SHA-256 checksum: bac4f625e89ae85fde6d5d0fe58af716b63835b4fcc97f4f30b397101797b5a4 |
| Binary packages compiled with GCC 14.2.1 | ||||||
| aocl-blis-linux-gcc-5.3.0.tar.gz | 5.3 | 29MB | 05/18/2026 | RHEL, Ubuntu, SLES | 64-bit | GCC compiled AOCL-BLAS library binary package SHA-256 checksum: ef68ae1854361aaa2ba62bf4cbb43f04caedc8d0227454057e62374a67543cca |
| aocl-libflame-linux-gcc-5.3.0.tar.gz | 5.3 | 33MB | 05/18/2026 | RHEL, Ubuntu, SLES | 64-bit | GCC compiled AOCL-LAPACK Library binary package SHA-256 checksum: dad58279b59ea4e8aa448764dd0dbc5e04c050be8ef30041d362993ba29ca43c |
| Windows Installer Compiled with Clang 19 | ||||||
| AOCL_Windows-setup-5.3.0-AMD.exe | 5.3 | 154MB | 05/18/2026 | Windows 11, Windows 10 | 64-bit | Windows installer file containing all the AOCL library binaries compiled with Clang 19. SHA-256 checksum: 021bfd69a439c3c2a72a6b5cf45d1de0e2f0deddb92ba19e73b9caeb259cf9c8 |
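
The SHA-256 checksums in the table can be used to confirm that a download is intact. The following is a minimal sketch using only the Python standard library; the function names are hypothetical helpers, and the file path in the usage comment assumes the package was saved to the current directory.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected_hex):
    """Compare a file's digest with a published checksum, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Example usage (assumes the file is in the current directory):
# ok = matches_published_checksum(
#     "aocl-blis-linux-aocc-5.3.0.tar.gz",
#     "ba72662b7606f2dc4ebbed2821a89cc2bc1ca0d72e10b4c5aa87a6e4e5b460ee",
# )
# print("checksum OK" if ok else "checksum MISMATCH, download again")
```

Reading in chunks keeps memory use flat, so the same helper works for the larger Windows installer as well as the Linux archives.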