5. Performance Modelling using AMDuProfPcm

5.1. Classic Roofline Model

AMDuProfPcm provides basic roofline modeling that relates application performance to memory traffic and peak floating-point compute throughput. The roofline is a visual performance model that offers insight into improving parallel software that performs floating-point operations. It helps characterize an application and identify whether a benchmark is memory bound or compute bound.
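The relationship behind a classic roofline chart can be sketched in a few lines of Python. This is a minimal illustration of the general roofline model, not AMDuProfPcm's internal implementation: attainable performance is taken as the smaller of the compute peak and the product of arithmetic intensity (floating-point operations per byte of DRAM traffic) and peak memory bandwidth. The function names and the peak values used in the demo are hypothetical.

```python
def attainable_gflops(arithmetic_intensity, peak_gflops, peak_bandwidth_gbs):
    """Classic roofline bound: the lower of the compute roof and the memory roof."""
    if arithmetic_intensity < 0:
        raise ValueError("arithmetic intensity must be non-negative")
    # Memory roof: FLOPs/byte * GB/s = GFLOPS
    memory_roof = arithmetic_intensity * peak_bandwidth_gbs
    return min(peak_gflops, memory_roof)


def ridge_point(peak_gflops, peak_bandwidth_gbs):
    """Arithmetic intensity at which the memory roof meets the compute roof."""
    return peak_gflops / peak_bandwidth_gbs


def classify(arithmetic_intensity, peak_gflops, peak_bandwidth_gbs):
    """Label a kernel by which roof limits it under this model."""
    if arithmetic_intensity < ridge_point(peak_gflops, peak_bandwidth_gbs):
        return "memory-bound"
    return "compute-bound"


if __name__ == "__main__":
    # Hypothetical machine peaks, chosen only for illustration.
    PEAK_GFLOPS = 1000.0
    PEAK_BANDWIDTH_GBS = 100.0
    for ai in (0.5, 10.0, 40.0):
        bound = attainable_gflops(ai, PEAK_GFLOPS, PEAK_BANDWIDTH_GBS)
        label = classify(ai, PEAK_GFLOPS, PEAK_BANDWIDTH_GBS)
        print(f"AI={ai} FLOPs/byte: bound={bound} GFLOPS ({label})")
```

A measured point that sits well below both roofs suggests headroom in the application itself rather than a hardware limit.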

The tool monitors memory traffic and floating-point operations while the profiled application is running. It also computes the arithmetic intensity, that is, the number of floating-point operations per byte of DRAM traffic [FLOPs/byte]. The roofline chart is plotted as:

By default, the tool plots horizontal rooflines for:

The options available to plot the maximum peak horizontal (computational) rooflines are:

5.2. Command Line Options

Generating the roofline chart of an application:

  1. Collect the data and generate the roofline HTML plot using AMDuProfPcm.

    $ AMDuProfPcm roofline -O /tmp -- /tmp/myapp.exe
    

    An output directory is created in the specified directory (/tmp), named in the format AMDuProfPcm-Roofline-<date>-<time>. It contains the HTML report (report.html). Open the HTML report and view the roofline graph located in the “” tab.

    On AMD Zen 4 and later processors, if the Linux kernel version does not support accessing DF counters, run the following command with root privileges.

    $ AMDuProfPcm roofline --msr -O /tmp/ -- /tmp/myapp.exe
    
  2. To generate the PDF roofline chart, run the following command (to be deprecated).

    $ AMDuProfModelling.py -i /tmp/myapp-roofline.csv -o /tmp/ --memspeed 3200 -a myapp
    

    The roofline chart is saved in the file /tmp/AMDuProf_roofline-yyyy-mm-dd-hhmmss.pdf.
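Because the report directory from step 1 carries a timestamp in its name, a small helper can locate the most recent report. The sketch below relies only on the directory naming (AMDuProfPcm-Roofline-<date>-<time>) and the report file name (report.html) described above. The function name is hypothetical, and choosing the newest report by file modification time is an assumption of this example.

```python
from pathlib import Path


def latest_roofline_report(output_dir):
    """Return the most recently modified report.html under output_dir, or None."""
    candidates = [
        run_dir / "report.html"
        for run_dir in Path(output_dir).glob("AMDuProfPcm-Roofline-*")
        if (run_dir / "report.html").is_file()
    ]
    if not candidates:
        return None
    # Pick the newest report by modification time.
    return max(candidates, key=lambda path: path.stat().st_mtime)


if __name__ == "__main__":
    report = latest_roofline_report("/tmp")
    print(report if report is not None else "No roofline report found")
```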

A few pointers for generating the roofline chart:

Example

$ AMDuProfModelling.py -i /tmp/myapp-roofline.csv -o /tmp/ --memspeed 3200 -a myapp -dp-roofs
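As background for the --memspeed value, the following sketch estimates a theoretical peak DRAM bandwidth, which is the kind of figure that sets the slope of a memory roofline. It assumes the memory speed is expressed in MT/s and that each channel moves 8 bytes per transfer (a 64-bit data path). The channel count is a property of the system and must be supplied by the user. How AMDuProfModelling.py actually uses --memspeed is not described here, so this is an illustration rather than the tool's formula. The function name is hypothetical.

```python
def peak_dram_bandwidth_gbs(mem_speed_mts, channels, bytes_per_transfer=8):
    """Theoretical peak DRAM bandwidth in GB/s.

    mem_speed_mts: memory speed in mega-transfers per second (assumed unit).
    channels: number of populated memory channels (system dependent).
    bytes_per_transfer: bytes moved per transfer per channel (8 for a 64-bit path).
    """
    if mem_speed_mts <= 0 or channels <= 0 or bytes_per_transfer <= 0:
        raise ValueError("all arguments must be positive")
    # MT/s * bytes/transfer = MB/s per channel; divide by 1000 for GB/s.
    return mem_speed_mts * bytes_per_transfer * channels / 1000.0


if __name__ == "__main__":
    # The channel counts below are hypothetical examples.
    for channels in (1, 8):
        bandwidth = peak_dram_bandwidth_gbs(3200, channels)
        print(f"{channels} channel(s) at 3200 MT/s: {bandwidth:.1f} GB/s")
```

Sustained bandwidth measured on real hardware is typically lower than this theoretical figure.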

Sample Roofline Chart

Figure 5.1 Sample Roofline Chart