Porting RTL Designs to Vitis RTL Kernels
Mar 17, 2023
1. Overview
Hardware programmability is the key advantage of FPGA-based accelerators compared to GPUs or ASICs. Accelerated computing is one of the most important applications of FPGAs. Powerful AMD FPGA devices and Alveo accelerator cards can help you achieve extensive adaptive acceleration in high performance computation and other diverse areas. AMD’s Vitis Unified Software Platform enables easy and efficient hardware acceleration for applications. The Vitis toolchain includes an HLS (high-level synthesis) toolset for designing FPGA hardware in C/C++, as well as support for the traditional RTL-based design methodology. The RTL design flow is a mature methodology supported by Vivado that offers performance and resource advantages.
Experienced users with mature or well-verified RTL modules in Vivado design flows can use the Vitis design flow to speed up new hardware acceleration projects. This article explains the procedure for porting existing RTL designs to RTL kernels for Vitis.
2. RTL Kernel Types
The Vitis flow supports two kinds of RTL kernels: the XRT-managed RTL kernel and the user-managed RTL kernel. You can decide which style to use based on the design characteristics.
XRT-managed RTL Kernel
XRT-managed RTL kernels are controlled by the XRT Native API directly. The hardware interfaces required for an XRT-managed RTL kernel are:
Programmable interface: AXI4-Lite control slave interface. A kernel can have only one AXI4-Lite control slave interface for register access. XRT controls kernel execution through this AXI4-Lite slave interface.
Data interface: any number and combination of AXI4 memory mapped master and AXI4-Stream master/slave interfaces. The memory mapped masters are used by the kernel to access global memory and the AXI4-Stream ports are used for inter-kernel communication.
Clock and reset: the RTL kernel requires clock and reset input signals.
In XRT-managed RTL kernels, some requirements are imposed on the control register’s definition in the AXI control slave. The following table explains the register address map requirements.
Offset | Name | Description |
---|---|---|
0x0 | Control | Controls and provides kernel status |
0x4 | Global Interrupt Enable | Used to enable interrupt to the host |
0x8 | IP Interrupt Enable | Used to control which IP generated signals are used to generate an interrupt |
0xC | IP Interrupt Status | Provides interrupt status |
0x10 and above | Kernel arguments | Includes scalars and global memory access address offset arguments |
In the above table, the key register for kernel execution control is the Control register at address offset 0x0. The signals in this register depend on the kernel execution mode. The following table shows the control signals that are accessed through the control register.
Bit | Name | Description |
---|---|---|
0 | ap_start | Asserted by XRT when the kernel can start processing data. Cleared on handshake with ap_done being asserted. |
1 | ap_done | Asserted by the kernel when it has completed operation and finished producing the output data. Cleared on read. |
2 | ap_idle | Asserted when the kernel is idle |
3 | ap_ready | Asserted by the kernel when it is ready to accept new data |
4 | ap_continue | Asserted by XRT to allow the kernel to keep running |
7 | auto_restart | Used to enable automatic kernel restart |
Others | Reserved | Reserved |
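The two tables above can be captured as a small set of constants for host-side debugging. The following is a minimal sketch, not an official XRT definition: the offsets and bit positions come from the tables, while the names `CTRL_OFFSET`, `Ctrl`, and `decode_ctrl` are assumptions chosen for illustration.

```python
from enum import IntFlag

# Register offsets taken from the address map table (names are illustrative).
CTRL_OFFSET = 0x00       # Control
GIE_OFFSET = 0x04        # Global Interrupt Enable
IER_OFFSET = 0x08        # IP Interrupt Enable
ISR_OFFSET = 0x0C        # IP Interrupt Status
ARGS_BASE_OFFSET = 0x10  # first kernel argument


class Ctrl(IntFlag):
    """Bit positions of the Control register, as listed in the bit table."""
    AP_START = 1 << 0
    AP_DONE = 1 << 1
    AP_IDLE = 1 << 2
    AP_READY = 1 << 3
    AP_CONTINUE = 1 << 4
    AUTO_RESTART = 1 << 7


def decode_ctrl(value):
    """Return the names of the documented bits set in a Control register value.

    Reserved bits are ignored.
    """
    return [flag.name for flag in Ctrl if value & flag]
```

For example, a raw value of `0x04` read from offset 0x0 decodes to `["AP_IDLE"]`, which is what the table suggests for a kernel that is not running.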
When designing an XRT-managed RTL kernel, the requirements described above must be fulfilled.
The XRT-managed RTL kernel can support one of three execution models: ap_ctrl_hs, ap_ctrl_chain, or ap_ctrl_none. The ap_ctrl_hs and ap_ctrl_chain models are implemented through the control signals corresponding to the bits in the control register at offset 0x0.
Using the ap_ctrl_chain model as an example, the following figures depict the required signal bits from the control register at offset 0x0.
The kernel execution XRT control flow is shown in the table below.
[Figure: ap_ctrl_chain control flow in two panels, Input Synchronization and Output Synchronization]
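The input and output synchronization of ap_ctrl_chain can be illustrated with a toy software model. This is only a sketch under stated assumptions: it assumes the host may write ap_start only while ap_start is low, that the kernel clears ap_start when it accepts the input (the ap_ready handshake), and that writing ap_continue after reading the output clears ap_done. The class `ChainKernelModel` and its "double the input" computation are hypothetical and do not describe real kernel hardware.

```python
# Control register bits 0..4, in the order given in the bit table.
AP_START, AP_DONE, AP_IDLE, AP_READY, AP_CONTINUE = (1 << n for n in range(5))


class ChainKernelModel:
    """Toy model of an ap_ctrl_chain control register (not real hardware)."""

    def __init__(self):
        self.ctrl = AP_IDLE
        self.result = None
        self._pending = None

    def host_start(self, data):
        # Input synchronization: the host starts only when ap_start is low.
        if self.ctrl & AP_START:
            return False
        self._pending = data
        self.ctrl = (self.ctrl | AP_START) & ~AP_IDLE
        return True

    def kernel_step(self):
        # The kernel stalls while a previous ap_done has not been acknowledged.
        if not (self.ctrl & AP_START) or (self.ctrl & AP_DONE):
            return
        # Accepting the input clears ap_start; finishing raises ap_done.
        self.result = self._pending * 2  # stand-in computation
        self.ctrl = (self.ctrl & ~AP_START) | AP_DONE

    def host_collect(self):
        # Output synchronization: read the result once ap_done is set, then
        # acknowledge it (the ap_continue write), which clears ap_done.
        if not (self.ctrl & AP_DONE):
            return None
        out = self.result
        self.ctrl &= ~AP_DONE
        if not (self.ctrl & AP_START):
            self.ctrl |= AP_IDLE
        return out
```

In this model a second `host_start` call is refused until the kernel has consumed the first input, which is the backpressure behavior that input synchronization is meant to provide.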
With the XRT-managed RTL kernel, you can get the high-level kernel execution analysis report from Vitis Analyzer in hardware emulation (hw_emu) mode, just as with a Vitis HLS kernel. This is an additional advantage of using the XRT-managed kernel type.
User-managed RTL Kernel
The user-managed RTL kernel has hardware interface requirements similar to those of the XRT-managed kernel. These include clock and reset signals, an AXI slave for kernel control, AXI masters for global memory data load/store, and AXI4-Stream ports for inter-kernel data communication. However, the design does not need to satisfy the control requirements of XRT and can implement any of a variety of execution mechanisms. There is no prescribed method of starting, stopping, or controlling your kernel. This is largely up to you and the specific requirements of the application or system.
Some of the available control schemes include:
Accessing registers through the AXI control slave interface via XRT register read/write API
Accessing the hardware through software drivers, such as UIO drivers, implemented in the host application
Triggering the start or stop response of the kernel from a signal provided by a separate component, or from another kernel
Providing a data-driven approach, such as using the AXI4-Stream port or auto-restarting
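The first scheme in the list, register access through the AXI control slave, can be sketched as a write-then-poll loop. In the example below, `FakeIp` is an in-memory stand-in so that the code runs without hardware; its method names mirror the `write_register` and `read_register` methods of the XRT C++ `xrt::ip` class, but the register layout (`START_REG`, `STATUS_REG`, `LENGTH_REG`, `RESULT_REG`) and the kernel behavior are purely hypothetical.

```python
class FakeIp:
    """In-memory stand-in for a register-mapped kernel; not an XRT class."""

    START_REG = 0x00   # hypothetical: write 1 to start the kernel
    STATUS_REG = 0x04  # hypothetical: bit 0 is set when the kernel is done
    LENGTH_REG = 0x10  # hypothetical scalar argument
    RESULT_REG = 0x14  # hypothetical result register

    def __init__(self):
        self.regs = {}

    def write_register(self, offset, value):
        self.regs[offset] = value & 0xFFFFFFFF
        if offset == self.START_REG and value & 1:
            # Pretend the kernel finishes immediately: result = length + 1.
            self.regs[self.RESULT_REG] = self.regs.get(self.LENGTH_REG, 0) + 1
            self.regs[self.STATUS_REG] = 1

    def read_register(self, offset):
        return self.regs.get(offset, 0)


def run_once(ip, length, max_polls=1000):
    """Program one argument, start the kernel, and poll until it reports done."""
    ip.write_register(FakeIp.LENGTH_REG, length)
    ip.write_register(FakeIp.START_REG, 1)
    for _ in range(max_polls):
        if ip.read_register(FakeIp.STATUS_REG) & 1:
            return ip.read_register(FakeIp.RESULT_REG)
    raise TimeoutError("kernel did not report done")
```

Because the protocol is defined entirely by the design, the host code has to encode the start, status, and argument offsets itself; this is the scheduling work that the XRT-managed style takes over.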
XRT-managed kernels can be thought of as a specialized form of user-managed kernels. However, for user-managed kernels, it is not possible to get the high-level kernel execution analysis report from Vitis Analyzer in hardware emulation (hw_emu) mode; you can still use the simulation waveform to analyze the behavior of the kernel in hardware emulation.
3. Porting Existing RTL Designs to RTL Kernels
Before starting to port your existing RTL designs to Vitis RTL kernels, you must first decide whether to use an XRT-managed or user-managed kernel style.
Generally, with the XRT-managed RTL kernel, you can use the built-in high-level XRT Native API to control the start and finish of kernel execution. You can also use the XRT-managed kernel execution queue to build input data backpressure for the kernel, which hides the PCIe transfer penalty, improves kernel utilization, and reduces latency. You do not need to implement these features yourself with complicated register read/write scheduling.
However, for some designs, the core functions might not be suited for XRT control, or it might not be easy to adapt the existing RTL code to satisfy the XRT-managed kernel control model. In these situations, you can use the user-managed RTL kernel style that provides maximal flexibility and requires little or no modification to the existing designs.
Follow the steps below to port your RTL design or Vivado project to the Vitis kernel.
Modify the Design to Satisfy RTL Kernel Requirements
You must make the necessary modifications to your existing design to satisfy the Vitis RTL kernel hardware interface requirements. These include clock and reset signal names, AXI4-Lite control slave interface configuration, and AXI master behavior. Refer to the relevant documents (e.g., UG1393) for more details about these requirements.
If you decide to use the XRT-managed RTL kernel style but are not yet familiar with it, the RTL Kernel Wizard is a good starting point: it provides a quick reference for an RTL design that is ready for the Vitis kernel porting flow. The RTL Kernel Wizard can be launched in the Vitis or Vivado GUI environment, and it includes a full RTL kernel Vivado project with example code for the AXI control slave, a simple vector adder module, and the full HDL wrapper for the RTL kernel design.
Package the Design into Vivado IP and Convert it to Vitis Kernel
One key step for porting the RTL design to Vitis is to package the design into a Vitis kernel file (XO file). You can use the GUI version of the IP Packager in Vivado to package the design into Vivado IP, and then generate the XO file. Vivado also provides a command line flow for Vitis kernel generation, where Vivado Tcl commands can be used in place of the GUI.
Four steps are required to convert an RTL design to a Vitis kernel:
Step 1: Create a Vivado project and add design sources. If you are already developing the RTL design in Vivado, skip this step.
Step 2: Infer clock, reset, and AXI interfaces, and associate the interfaces with the clock. Define and associate the necessary ports to satisfy the Vitis RTL kernel requirements.
Step 3: Set the definition of AXI control slave registers, including CTRL and user kernel arguments. This step is for XRT-managed RTL kernels only. If you are using a user-managed kernel, skip this step; the host program accesses the control registers by address offset.
Step 4: Package Vivado IP and generate the Vitis kernel file. In this step, the RTL design source code is packaged into a Vivado IP and converted to a Vitis kernel (.xo) file.
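The register definition in Step 3 amounts to writing down an offset for each kernel argument, starting at 0x10. The sketch below computes such a layout under one illustrative assumption: every argument is aligned to its own size (4 bytes for a 32-bit scalar, 8 bytes for a 64-bit global memory address). The function `layout_args` and the argument names in the usage note are hypothetical, and your design's actual register map takes precedence.

```python
ARGS_BASE = 0x10  # first kernel-argument offset, from the register map table


def layout_args(args):
    """Assign a register offset to each (name, size_in_bytes) argument.

    Assumes natural alignment (an illustrative choice, not a tool requirement):
    each argument is placed at the next offset that is a multiple of its size.
    """
    offsets = {}
    cursor = ARGS_BASE
    for name, size in args:
        if size not in (4, 8):
            raise ValueError("only 4-byte and 8-byte arguments are modeled")
        cursor = (cursor + size - 1) // size * size  # round up to alignment
        offsets[name] = cursor
        cursor += size
    return offsets
```

With a 4-byte scalar followed by two 8-byte pointers, `layout_args([("scalar00", 4), ("A", 8), ("B", 8)])` places the arguments at 0x10, 0x18, and 0x20 respectively. Writing the map down this way also gives a user-managed host program the offsets it needs.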
4. Host Programming for RTL Kernel Control
XRT provides an easy programming environment for Vitis acceleration kernel control. XRT handles input/output data transfers between the host and the device, as well as control of kernel execution, so you do not need to be concerned with the underlying hardware driver development. Whether the kernel is a Vitis HLS kernel or a Vitis RTL kernel, the structure of the host application is similar. The host program controls the RTL kernel in a few steps:
Specify the accelerator device ID and load the XCLBIN file (generated in Vitis linking).
Set up the PL kernel and kernel arguments.
Transfer data between the software application and PL kernels.
Run the kernel and return results.
The first three steps are the same for both XRT-managed kernels and user-managed kernels, and the key differences between them lie in the kernel control programming model, as summarized in the table below.
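XRT-Managed Kernels | User-Managed Kernels |
---|---|
Kernel start and completion are handled through the high-level XRT Native API, and executions can be queued by XRT | The host application reads and writes the kernel control registers by address offset through the XRT register read/write API |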
For more details about the XRT Native APIs for RTL kernel host programming, please refer to XRT Native APIs.
5. Summary
RTL kernels are well supported in the Vitis flow, and porting existing RTL designs into Vitis RTL kernels can be accomplished easily. Two RTL kernel types are supported, the XRT-managed kernel and the user-managed kernel; you can choose between them according to the original RTL design characteristics and usage scenario. XRT provides a simple programming interface to control the execution of RTL kernels, and you can select the appropriate XRT API for each of the two kernel types.