AMD uProf supports .NET/CLR application profiling through a common language runtime (CLR) profiler that uses the profiling API.
AMD uProf provides CLR Agent libraries: AMDClrProfAgent.dll on Windows. This CLR Agent library must be loaded during startup of the target managed process.
Use the dotnet --version command to ensure that .NET is installed on the system.
When AMD uProf launches the .NET application, it sets the COR_PROFILER environment variable, which specifies the CLSID of the AMDClrProfAgent library. The library uses the ICLRProfiling interface to attach the agent to the running CLR process. AMD uProf can then collect the profile data and attribute the samples to the CLR process.
To launch the AMDuProf GUI, go to Home > Welcome page.
Click Profile an Application on the Welcome page.
Provide application path, application options, working directory, and environment variables, if any. Click Next.
Specify the following configuration parameters:
Application Path: Path to .NET binary
Application Options: Launch app arguments
Working Directory: Launch app path
From Predefined Configs, select Required Configuration.
Click Start Profile to start the profiling.
To profile a .NET application:
$ ./AMDuProfCLI collect --config tbp -w <.net-app-dir> <path-to-.net.exe> <.net-app-main>
To generate a report:
$ ./AMDuProfCLI report --src-path <path-to-.net-app-source-dir> -i <raw-data-file-path>
Click the ANALYZE tab to identify the hottest .NET functions.
Figure 8.1 Function Hotspots
AMD uProf will attribute the profile samples to .NET methods and the source tab will show the .NET source lines with the corresponding samples attributed to them.
Refer to the section Source and Assembly for more information on this screen. The following figure shows the source view of the .NET method:
Figure 8.2 Source view - .NET Method
.NET profiling is supported only on Windows platforms.
uProf does not support the Visual Studio 2022 Properties > Build > Optimize Code option.
uProf does not support the Visual Studio 2022 Properties > Build > Debug Symbols options Portable and Embedded.
AMD uProf supports profiling Java applications running on a JVM. To support this, it uses the JVM Tool Interface (JVMTI).
AMD uProf provides JVMTI Agent libraries: AMDJvmtiAgent.dll on Windows and libAMDJvmtiAgent.so on Linux. This JvmtiAgent library must be loaded during start up of the target JVM process.
Use the which java command and ensure that Java is installed on the system.
Run echo $JAVA_HOME to see if it is pointing to the supported Java version.
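The two checks above can be combined into one small script. This is a minimal sketch and not part of AMD uProf: the function name java_home_status is hypothetical, and it only verifies that the given directory contains an executable bin/java. It does not check the Java version.

```shell
#!/bin/sh
# Hypothetical helper (not part of AMD uProf): classify a JAVA_HOME value.
java_home_status() {
  if [ -z "$1" ]; then
    echo "unset"      # JAVA_HOME is empty or not defined
  elif [ -x "$1/bin/java" ]; then
    echo "ok"         # an executable java launcher exists under JAVA_HOME
  else
    echo "missing"    # the directory does not contain bin/java
  fi
}

# Report the launcher found on PATH (the same lookup as `which java`) and JAVA_HOME.
echo "java on PATH: $(command -v java || echo 'not found')"
echo "JAVA_HOME:    $(java_home_status "${JAVA_HOME:-}")"
```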
When the Java application is launched by AMD uProf, the tool can collect the profile data and attribute the samples to interpreted Java functions.
Using GUI
To launch the AMDuProf GUI, go to Home > Welcome page.
Click Profile an Application on the Welcome page.
Provide application path, application options, working directory, and environment variables, if any. Click Next.
Specify the following configuration parameters:
Application Path: Path to Java binary
Application Options: Launch app arguments
Working Directory: Launch app path
From Predefined Configs, select Required Configuration.
Set the timer interval and profiling signal.
Only if using Linux: From Advanced Options, select the Callstack Collection and Callstack Unwind Depth.
Click Start Profile to start the profiling.
Using CLI
To profile a Java application:
$ ./AMDuProfCLI collect --config tbp -w <java-app-dir> <path-to-java.exe> <java-app-main>
To collect the Java call stack on a Linux platform:
$ ./AMDuProfCLI collect --config tbp -g -w <java-app-dir> <path-to-java.exe> <java-app-main>
To generate a report:
$ ./AMDuProfCLI report --src-path <path-to-java-app-source-dir> -i <raw-data-file-path>
Note
Use absolute paths as arguments when running a Java executable.
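One way to follow this note from a script is to resolve relative paths before building the command line. The helper below is a sketch only: abs_path is a hypothetical name, and it resolves against the current directory without following symbolic links.

```shell
#!/bin/sh
# Hypothetical helper: turn a relative path into an absolute one.
abs_path() {
  case "$1" in
    /*) printf '%s\n' "$1" ;;                 # already absolute, print unchanged
    *)  printf '%s/%s\n' "${PWD%/}" "$1" ;;   # resolve against the current directory
  esac
}

# Example with the placeholders used in the commands above:
# ./AMDuProfCLI collect --config tbp -w "$(abs_path <java-app-dir>)" "$(command -v java)" <java-app-main>
```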
AMD uProf can attach to a running Java application using the --pid option. It can then collect the profile data and attribute the samples to interpreted Java functions.
To launch the AMDuProf GUI, go to Home > Welcome page.
Click Profile an Application on the Welcome page.
Click Profile running Process(es) on the Welcome page.
From Predefined Configs, select GPU Profile.
Figure 8.3 Profile Target Java Process
Select the process Id.
From Advanced Options, select Callstack Collection and Callstack Unwind Depth.
Click Start Profile to start the profiling.
To profile a running Java process:
$ ./AMDuProfCLI collect --config tbp -p <java-app-launch-pid> -o /tmp/ -d <duration>
To profile a running Java process with call stack collection:
$ ./AMDuProfCLI collect --config tbp -g -p <java-app-launch-pid> -o /tmp/ -d <duration>
To generate a report:
$ ./AMDuProfCLI report --src-path <path-to-java-app-source-dir> -i <raw-data-file-path>
Note
Default duration is 30 seconds.
AMD uProf cannot attach JvmtiAgent dynamically to an already running JVM. Hence, for any JVM process profiled by attach-process mechanism, AMD uProf cannot capture any class information, unless the JvmtiAgent library is loaded during JVM process start up.
To profile an already running Java process, pass the -agentpath:<path-to-agent-lib> option when launching the Java application so that AMD uProf can later attach to the Java PID and collect profile data.
For a 64-bit JVM on Windows:
C:\> java -agentpath:<C:\ProgramFiles\AMD\AMDuProf\bin\ProfileAgents\x64\AMDJvmtiAgent.dll> <java-app-launch-options>
Note the process id (PID) of this JVM instance.
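On Linux, the same launch can be scripted so that the PID is captured automatically. This is a sketch under assumptions: the agent library location in AGENT_LIB is a guess that mirrors the Windows layout and must be adjusted to the actual install directory, and agentpath_option is a hypothetical helper name.

```shell
#!/bin/sh
# Assumed location of the Linux agent library; adjust to your AMD uProf install.
AGENT_LIB="${AGENT_LIB:-/opt/AMDuProf/bin/ProfileAgents/x64/libAMDJvmtiAgent.so}"

# Hypothetical helper: build the JVM option that loads the agent at startup.
agentpath_option() {
  printf '%s%s\n' '-agentpath:' "$1"
}

# Launch the JVM in the background and print the PID to attach to later:
# java "$(agentpath_option "$AGENT_LIB")" <java-app-launch-options> &
# echo "JVM PID: $!"
```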
To launch the AMDuProf GUI, go to Home > Welcome page.
Click Profile an Application on the Welcome page.
Click Profile running Process(es) on the Welcome page.
From Select Profile Target, select Process(es).
Select the Java process Id.
From Advanced Options, select Callstack Collection and Callstack Unwind Depth.
Click Start Profile to start the profiling.
AMD uProf will attribute the profile samples to Java methods, and the source tab will show the Java source lines with the corresponding samples attributed to them.
Refer to the section Source and Assembly for more information on the source screen.
The following figure shows the source view of the Java method:
Figure 8.4 Java Method - Source View
Note
To attach to a Java process on Linux, pass the JVM option -XX:+PreserveFramePointer when launching the target application so that AMD uProf can collect the correct Java application call stack.
To collect the call stack while profiling a Java application:
$ ./AMDuProfCLI collect --config tbp -g -w <java-app-dir> <path-to-java-exe> <java-app-main>
Figure 8.5 Java Application - Flame Graph
Java profiling has the following limitations:
Java call stack profiling is supported only on Linux platforms.
CLI option --tid does not support attaching uProf to a running Java thread.
A single instance of uProf can attach the Java agent to only one target Java application. If multiple target Java applications are specified, the first one is picked.
uProf supports attaching to Java applications running on Java 11 and all newer versions. Older Java versions are not supported.
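A launch script can check the version limitation before attempting to attach. The sketch below assumes the version string has already been extracted, for example with java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}'. Both function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: extract the major version from a Java version string.
# Handles the legacy "1.8.0_292" scheme and the modern "11.0.2" scheme.
java_major_version() {
  case "$1" in
    1.*) v="${1#1.}"; echo "${v%%.*}" ;;   # "1.8.0_292" -> "8"
    *)   echo "${1%%[.+-]*}" ;;            # "11.0.2" -> "11", "17-ea" -> "17"
  esac
}

# Succeeds when the version meets the Java 11 minimum stated above.
java_attach_supported() {
  [ "$(java_major_version "$1")" -ge 11 ]
}
```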
AMD uProf provides comprehensive profiling capabilities for Python applications, enabling developers to identify and analyze performance bottlenecks and extended execution times on Linux operating systems. The profiler utilizes the Python interpreter’s runtime support mechanisms to perform detailed Hotspot Analysis, facilitating performance characterization and application optimization.
Key Features
Non-Intrusive Profiling: Profile applications without modifying the source code.
Function-Level Granularity: Capture performance metrics at the Python function level.
Source-Level Analysis: Analyze performance line by line using eBPF mode.
Mixed-Mode Analysis: Provide a unified profiling view for both native and Python code.
Call Stack Collection: Capture complete call stacks with mixed native/Python execution traces.
Multi-Process and Multi-Thread Support: Profile applications utilizing multiple processes and threads.
Profiling Modes
AMD uProf provides two profiling modes for Python applications, each with distinct capabilities and version support:
eBPF Sampling Mode: Available exclusively for Python 3.10
Tracing Mode: Available for Python 3.10, 3.11, 3.12, and 3.13
The following table details the feature support matrix for each profiling mode:
| Feature | eBPF Sampling Mode | Tracing Mode |
|---|---|---|
| Function-Level Attribution | Yes | Yes |
| Source-Level Attribution | Yes | No |
| Mixed-Mode Profiling (Native + Python) | Yes | Yes |
| Call Stack Analysis | Yes (requires frame pointer) | Yes |
| Multi-Process Support | Yes | Yes |
| Multi-Threaded Support | Yes | Yes |
| Launch Application Profiling | Yes | Yes |
| Attach to Running Process | Yes | No |
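The version support stated above can be encoded as a lookup when a script has to choose a mode. This is only an illustrative sketch: python_profiling_modes is a hypothetical name, and the major.minor value is assumed to come from something like python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])'.

```shell
#!/bin/sh
# Hypothetical helper: list the profiling modes named above for a CPython version.
python_profiling_modes() {
  case "$1" in
    3.10)           echo "ebpf tracing" ;;   # both modes are available
    3.11|3.12|3.13) echo "tracing" ;;        # Tracing mode only
    *)              echo "none" ;;           # version not listed as supported
  esac
}
```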
eBPF Sampling mode utilizes the Linux kernel’s extended Berkeley Packet Filter (eBPF) functionality to collect performance samples with minimal overhead. This mode provides comprehensive profiling capabilities including source-level attribution and mixed-mode analysis.
Leverages kernel eBPF infrastructure for low-overhead sampling
Provides both function-level and source-level performance attribution
Supports mixed-mode profiling (native and Python code)
Requires frame pointer preservation for accurate call stack collection
Targets Python 3.10 exclusively.
Administrator privileges are required once to configure Linux capabilities for the AMD uProf eBPF library:
sudo ./AMDuProfSetup.sh
For accurate call stack collection, Python binaries must be built with frame pointers enabled. To do so:
Download Python source: GitHub - python/cpython
Checkout version: git checkout 3.10
Configure: ./configure --enable-loadable-sqlite-extensions --with-openssl-rpath=auto CFLAGS="-fno-omit-frame-pointer -mno-omit-leaf-frame-pointer"
Build: make
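After the build, it can be useful to confirm that the frame pointer flag actually reached the compiler. The sketch below checks a flag string. It assumes that the interpreter records the configure-time flags in the sysconfig variable CONFIGURE_CFLAGS, so verify that on your build. The function name has_frame_pointer_flag is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a compiler-flag string keeps frame pointers.
has_frame_pointer_flag() {
  case " $1 " in
    *" -fno-omit-frame-pointer "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Query the flags recorded by the freshly built interpreter (assumed variable name):
# flags="$(./python -c 'import sysconfig; print(sysconfig.get_config_var("CONFIGURE_CFLAGS"))')"
# has_frame_pointer_flag "$flags" && echo "frame pointers enabled"
```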
The eBPF Sampling mode integrates with Hotspot Analysis to profile Python applications through both launch and attach profiling workflows. When a Python interpreter path is specified as the target executable, the profiler enables Python-specific sample collection in parallel with standard Hotspot Analysis. This collection encompasses samples from all child processes and threads spawned by the target Python script.
The Python Profiling Agent supports profiling applications launched directly via Python interpreter as well as those initiated through shell scripts.
Launch AMDuProf and navigate to Home > Welcome page.
Select Profile an Application.
Configure the target application parameters including the executable path, command-line arguments, working directory, and environment variables as needed. Click Next to proceed.
Specify the following configuration parameters:
Application Path: Path to python binary
Application Options: Application arguments
Working Directory: Application path
Figure 8.6 Python Application Configuration
From the Predefined Configs section, select Hotspots Configuration.
Figure 8.7 Hotspots Profile Selection
Expand the Advanced Options section and enable eBPF mode Python Profiling.
Figure 8.8 eBPF-based Python Profiling Activation
Click Start Profile to begin the profiling session.
The --python option enables eBPF-based sampling for Python profiling.
To collect Python hotspot data:
AMDuProfCLI collect --config hotspots -o <output-dir> --python python <application.py>
To collect Python hotspots with call stack information:
AMDuProfCLI collect --config hotspots -g -o <output-dir> --python python <application.py>
The -g option enables call stack sample collection for both Python and native code execution.
Note
The Python binary must be compiled with frame pointer support enabled to ensure accurate call stack collection.
To profile a Python script launched via shell script:
AMDuProfCLI collect --config hotspots -o <output-dir> --python <application.sh>
You can attach AMD uProf to the running Python processes using the --pid option, enabling profile data collection and sample attribution to Python functions. The --python flag activates eBPF-based sampling for Python profiling.
Launch the AMDuProf GUI and navigate to Home > Welcome page.
Click Profile an Application.
Click Profile running Process(es).
From Predefined Configs, select Hotspots Configuration.
Select the process ID.
Click Advanced Options and select eBPF mode Python Profiling.
Click Start Profile.
To profile an already running Python process:
AMDuProfCLI collect --config hotspots -g -o <output-dir> --python -p <python-process-id>
Specify the process ID of the target Python process using the -p option.
Alongside the native timing metrics, the Python profiler also reports hotspots for the Python application. Refer to Hotspots Analysis for how to identify the hottest functions.
Figure 8.9 Python Hotspots
Use the Flame Graph to identify the hottest code paths of an application. Each code path contains both native and Python functions.
Figure 8.10 Function Hotspots
Refer to Hotspots Analysis for how to identify code paths.
AMD uProf attributes performance samples to Python functions and displays corresponding source line annotations in the Source tab view.
For detailed information on source-level analysis capabilities, refer to Source and Assembly.
Figure 8.11 Source view - Python Method
Note
Assembly-level profiling is not supported for Python applications.
In addition to the Hotspots Analysis Limitations, Python profiling has the following constraints:
eBPF Sampling mode is available exclusively for the CPython 3.10 distribution.
Assembly-level profiling is not supported for Python applications.
Accurate call stack collection requires Python binaries compiled with frame pointer support.
Profiling of executable binaries created from Python scripts is not supported.
Thread-level profiling using the --tid option is not supported for Python applications.
Python profiling is only supported with the predefined Hotspots configuration.
Tracing mode employs Python’s native tracing hooks to monitor function execution and collect performance data. This mode offers broad compatibility across multiple Python versions while maintaining function-level profiling capabilities.
Provides function-level performance attribution.
Supports mixed-mode profiling (native and Python code).
Does not require frame pointer preservation.
Supported for Python 3.10, 3.11, 3.12, and 3.13.
Introduces higher overhead compared to eBPF Sampling mode.
Use the which python command to ensure that the Python alias is pointing to the supported Python interpreter versions.
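The interpreter check can be automated by parsing the output of python --version. This is a minimal sketch with hypothetical function names. It assumes the usual "Python X.Y.Z" output format.

```shell
#!/bin/sh
# Hypothetical helper: reduce `python --version` output to "major.minor".
python_major_minor() {
  printf '%s\n' "${1#Python }" | cut -d. -f1,2   # "Python 3.11.4" -> "3.11"
}

# Succeeds when the interpreter is one of the versions listed for Tracing mode.
tracing_mode_supported() {
  case "$(python_major_minor "$1")" in
    3.10|3.11|3.12|3.13) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
# tracing_mode_supported "$(python --version 2>&1)" || echo "unsupported interpreter"
```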
Use Hotspot Analysis to profile a Python application using the launch application profile scope. When the target executable is the path to a Python interpreter, the profiler automatically initiates sample collection for the Python application alongside the Hotspot Analysis. The samples gathered by the Python Profiling Agent include those from child processes and threads created by the targeted Python script.
Additionally, the Python Profiling Agent also supports profiling Python applications launched from shell scripts.
To launch the AMDuProf GUI, go to Home > Welcome page.
Click Profile an Application on the Welcome page.
Provide the application path, application options, working directory, and environment variables, if any. Click Next.
From Select Profile Target, select Hotspots.
Set the timer interval and profiling signal.
From Advanced Options, select Callstack Collection and Callstack Unwind Depth.
Click Start Profile to start the profiling.
Python hotspots collection
AMDuProfCLI collect --config hotspots -o <output-dir> python <application.py>
Python hotspots with callstack collection
AMDuProfCLI collect --config hotspots -g -o <output-dir> python <application.py>
Python hotspots when python script is launched from shell script
AMDuProfCLI collect --config hotspots -o <output-dir> <application.sh>
To show python interpreter functions in the callgraph/flamegraph
AMDuProfCLI report -i <python_session> --detail --python-show-all
Note
Default sampling interval is 10 seconds.
For more information refer to Hotspots Analysis.
Use Hotspot Analysis to identify the hottest Python functions and a mixed-mode call stack (both native and Python).
Alongside the native timing metrics, the Python profiler also reports hotspots for the Python application. Refer to Hotspots Analysis for how to identify the hottest functions.
Figure 8.12 Function Hotspots
Use the Flame Graph to identify the hottest code paths of an application. Each code path contains both native and Python functions.
Figure 8.13 Function Hotspots
Refer to Hotspots Analysis for how to identify code paths.
Along with Hotspots Analysis Limitations, Python profiling has a few other limitations:
Only CPython distribution versions 3.10, 3.11, 3.12, and 3.13 are supported.
Supported for launch application profile scope only.
Supported only on Linux operating systems.
Python source code profiling is not supported.
Python instruction level profiling is not supported.
Only Hotspots Analysis is supported.
MPI programs launched through the mpirun or mpiexec launcher programs can be profiled by AMD uProf. To profile MPI applications and analyze the data, complete the following steps:
Collect the profile data using CLI collect command.
Process the profile data using CLI translate command which will generate the profile database.
Import the profile database in the GUI or generate the CSV report using CLI report command.
Multiple ranks profiling requires a higher memory-locking limit, which can be set using one of the following methods:
- Increase the memory lock limit using the command ulimit -l, depending on the number of ranks to be profiled on the target node.
- Set /proc/sys/kernel/perf_event_paranoid to -1 or a higher value based on the profile config and scope.
- Profile MPI applications with root privilege.
Multiple ranks profiling might require a high number of file descriptors. If the file descriptor limit is reached during profile data collection, an error message will be displayed. You can increase this limit in the file /etc/security/limits.conf.
For multiple ranks profiling, if the /proc/sys/kernel/perf_event_paranoid value is greater than -1, you must increase the /proc/sys/kernel/perf_event_mlock_kb value depending on the number of ranks to profile. Alternatively, you can also use the -m option to decrease the number of memory data buffer pages used by each instance of AMDuProfCLI.
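Before launching a multi-rank profile, the limits discussed above can be inspected on each node. The script below is a sketch, not an AMD uProf tool: needs_mlock_tuning is a hypothetical helper that only encodes the "greater than -1" rule from the text, and the /proc files are read only when they exist.

```shell
#!/bin/sh
# Hypothetical helper: succeed when the perf_event_paranoid value is greater than -1,
# the case in which the text above asks for a larger perf_event_mlock_kb (or -m).
needs_mlock_tuning() {
  [ "$1" -gt -1 ]
}

# Print the current limits on this node.
echo "max locked memory (ulimit -l): $(ulimit -l)"
echo "open file limit   (ulimit -n): $(ulimit -n)"
for f in /proc/sys/kernel/perf_event_paranoid /proc/sys/kernel/perf_event_mlock_kb; do
  if [ -r "$f" ]; then
    echo "$f = $(cat "$f")"
  fi
done
```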
Support Matrix
The profiling of MPI applications supports components and their corresponding versions provided in this MPI Trace Support Matrix.
The MPI jobs are launched using MPI launchers such as mpirun and mpiexec. Use AMDuProfCLI to collect the CPU profile data for an MPI application.
The MPI job launch through mpirun uses the following syntax:
$ mpirun [options] <program> [<args>]
Here, AMDuProfCLI takes the place of <program>, and the application is passed as AMDuProfCLI’s arguments. So, use the following syntax to profile an MPI application using AMDuProfCLI:
$ mpirun [options] AMDuProfCLI [options] <program> [<args>]
The specific AMDuProfCLI flags for profiling MPI applications:
--mpi option must be specified for a multi-rank job launched via mpirun/mpiexec to collect the profiling data (for example, CPU profiling data). This option notifies uProf that an MPI launcher is used. Specifying this option alone does not enable CPU profiling data collection. Refer to Application Analysis - Getting Started for more details on how to enable profiling data collection.
--trace mpi option can be specified to collect the MPI trace data. When this option is specified, the --mpi option can be omitted. Refer to Parallelism MPI Trace Analysis for more details.
--output-dir <output dir> specifies the path to a directory in which the profile files are saved. A session directory will be created within the <output dir> containing all the data collected from all the ranks.
A typical command uses the following syntax:
$ mpirun -np <np> /tmp/AMDuProf/bin/AMDuProfCLI collect --config <config-type> --trace mpi --output-dir <output_dir> [mpi_app] [<mpi_app_options>]
If an MPI application is launched on multiple nodes, AMDuProfCLI will profile all the MPI rank processes running on all the nodes. You can analyze the data for processes run on one/many/all node(s).
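When the same launch is repeated with different rank counts or configurations, the typical command can be composed by a small wrapper. This is a sketch only: mpi_collect_cmd is a hypothetical name, the AMDUPROFCLI variable is an assumption for the path to the CLI binary, and the function prints the command rather than running it. Arguments that contain spaces are not preserved.

```shell
#!/bin/sh
# Hypothetical helper: print the mpirun + AMDuProfCLI command line described above.
# Arguments: <np> <config-type> <output-dir> <mpi-app> [<mpi-app-options>...]
mpi_collect_cmd() {
  np="$1"; config="$2"; outdir="$3"
  shift 3   # the remaining arguments are the MPI application and its options
  echo "mpirun -np $np ${AMDUPROFCLI:-AMDuProfCLI} collect --config $config --trace mpi --output-dir $outdir $*"
}

# Example: print the command for 16 ranks with the tbp configuration.
# AMDUPROFCLI=/tmp/AMDuProf/bin/AMDuProfCLI
# mpi_collect_cmd 16 tbp /tmp/myapp-perf myapp.exe
```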
Method 1 - Profile All the Ranks On Single/Multiple Node(s)
To collect profile data for all the ranks running on a single node, execute the following commands:
$ mpirun -np 16 /tmp/AMDuProf/bin/AMDuProfCLI collect --config tbp --trace mpi --output-dir /tmp/myapp-perf myapp.exe
To collect profile data for all the ranks on multiple nodes, use the mpirun option -H / --host or specify -hostfile <hostfile>:
$ mpirun -np 16 -H host1,host2 /tmp/AMDuProf/bin/AMDuProfCLI collect --config tbp --trace mpi --output-dir /tmp/myapp-perf myapp.exe
$ mpirun -np 16 -H host1,host2 /tmp/AMDuProf/bin/AMDuProfCLI collect --config tbp --mpi --output-dir /tmp/myapp-perf myapp.exe
Method 2 - Profiling Specific Rank(s)
To profile only a single rank running on host2, execute the following commands:
$ export AMDUPROFCLI_CMD="/tmp/AMDuProf/bin/AMDuProfCLI collect --config tbp --trace mpi --output-dir /tmp/myapp-perf"
$ mpirun -np 4 -host host1 myapp.exe : -host host2 -np 1 $AMDUPROFCLI_CMD myapp.exe
To profile only a single rank in setup where 256 ranks running on 2 hosts (128 ranks per host):
$ mpirun -host host1:128 -np 1 $AMDUPROFCLI_CMD myapp.exe : -host host2:128,host1:128 -np 255 --map-by core myapp.exe
mpirun also accepts a config file as input, and AMDuProfCLI can be used within the config file to profile the MPI application.
Config file (myapp_config):
#MPI - myapp config file
-host host1 -n 4 myapp.exe
-host host2 -n 2 /tmp/AMDuProf/bin/AMDuProfCLI collect --config tbp --trace mpi \
--output-dir /tmp/myapp-perf myapp.exe
To run this config to collect data only for the MPI processes running on host2, execute the following command:
$ mpirun --app myapp_config
The data collected for MPI processes can be analyzed using the CSV report generated by the AMDuProfCLI report command. The generated report is saved to the file report.csv in the <output-dir>/<SESSION-DIR> folder.
Following are the reporting options for the CLI:
Generate a report for all the MPI processes that ran on the localhost (for example, host1) from which the MPI launcher was launched (using the new option --input-dir):
$ AMDuProfCLI report --input-dir /tmp/myapp-perf/<SESSION-DIR> --host host1
Generate a report for all the MPI processes that ran on another host (for example, host2) from which the MPI launcher was not launched:
$ AMDuProfCLI report --input-dir /tmp/myapp-perf/<SESSION-DIR> --host host2
Note
Option --host is not mandatory to create the report file for the localhost.
To analyze the profile data in the GUI, complete the following steps:
To generate the profile database, refer to Analyzing the Data with CLI.
To import the profile database, refer to Importing Profile Database.
The MPI environment parameters such as Total number of ranks and Number of ranks running on each node are currently supported only for OpenMPI. Profiling of MPI applications with system-wide profiling scope is not supported.
Profiling of MPI applications is not supported with ProfileControlAPIs.
To attribute the samples to the system modules (for example, glibc and libm), AMD uProf uses the corresponding debug info files. The Linux distros do not contain the debug info files, but most of the popular distros provide options to download the debug info files.
Refer to the following resources for more information on how to download the debug info files:
Ensure that you download the debug info files for the required system modules for the required Linux distros before starting the profiling.
To profile and analyze the Linux kernel modules and functions:
Enable the kernel symbol resolution.
Do one of the following:
Download and install kernel debug symbol packages and source.
Build Linux kernel with debug symbols.
After the kernel debug info is available in the default path, AMD uProf automatically locates and utilizes that debug info to show the kernel sources lines and assembly in the source view.
Note
Supported OS: Ubuntu 18.04 LTS, Ubuntu 20.04 LTS, and RHEL 8.
To attribute the kernel samples to appropriate kernel functions, AMD uProf extracts the required information from the /proc/kallsyms file. Exposing the kernel symbol addresses through /proc/kallsyms requires setting an appropriate value in the /proc/sys/kernel/kptr_restrict file. Configure the system as follows:
Set /proc/sys/kernel/perf_event_paranoid to -1.
Set /proc/sys/kernel/kptr_restrict to an appropriate value as follows:
0: The kernel addresses are available without any limitations.
1: The kernel addresses are available if the current user has the CAP_SYSLOG capability.
2: The kernel addresses are hidden.
Set the perf_event_paranoid value using one of the following:
$ sudo sh -c 'echo -1 > /proc/sys/kernel/perf_event_paranoid'
$ sudo sysctl -w kernel.perf_event_paranoid=-1
Set the kptr_restrict value using one of the following:
$ sudo sh -c 'echo 0 > /proc/sys/kernel/kptr_restrict'
$ sudo sysctl -w kernel.kptr_restrict=0
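To confirm the settings took effect, the current values can be read back and interpreted. The script below is a minimal sketch: describe_kptr_restrict is a hypothetical helper that restates the three values listed above, and the files are read only when present.

```shell
#!/bin/sh
# Hypothetical helper: explain a kptr_restrict value using the meanings listed above.
describe_kptr_restrict() {
  case "$1" in
    0) echo "kernel addresses available without limitations" ;;
    1) echo "kernel addresses available with CAP_SYSLOG" ;;
    2) echo "kernel addresses hidden" ;;
    *) echo "unknown value" ;;
  esac
}

# Read back the current settings (Linux only).
for name in perf_event_paranoid kptr_restrict; do
  f="/proc/sys/kernel/$name"
  if [ -r "$f" ]; then
    echo "$name = $(cat "$f")"
  fi
done
```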
On a Linux system, the /boot directory contains either the compressed or the uncompressed vmlinux image. These kernel files are stripped and have no symbol or debug information. Without debug information, AMD uProf cannot attribute samples to kernel functions; hence, by default, AMD uProf cannot report kernel functions.
Some Linux distros provide debug symbol files for their kernel which can be used for profiling purposes.
Ubuntu
To download kernel debug info and source code on Ubuntu systems (verified on Ubuntu 18.04.03 LTS):
To trust the debug symbol signing key, execute the following commands:
// Ubuntu 18.04 LTS and later:
$ sudo apt install ubuntu-dbgsym-keyring
// For earlier releases of Ubuntu:
$ sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys F2EDC64DC5AEE1F6B9C621F0C8CAB6595FDFF622
Add the debug symbol repository as follows:
$ echo "deb http://ddebs.ubuntu.com $(lsb_release -cs) main restricted universe multiverse
deb http://ddebs.ubuntu.com $(lsb_release -cs)-security main restricted universe multiverse
deb http://ddebs.ubuntu.com $(lsb_release -cs)-updates main restricted universe multiverse
deb http://ddebs.ubuntu.com $(lsb_release -cs)-proposed main restricted universe multiverse" | \
sudo tee -a /etc/apt/sources.list.d/ddebs.list
Retrieve the list of available debug symbol packages:
$ sudo apt update
Install the debug symbols for the current kernel version:
$ sudo apt install --yes linux-image-$(uname -r)-dbgsym
Download the kernel source using one of the following methods:
$ sudo apt source linux-image-unsigned-$(uname -r)
$ sudo apt source linux-image-$(uname -r)
After the kernel debug info file is downloaded, it can be found at the default path:
/usr/lib/debug/boot/vmlinux-`uname -r`
RHEL
Follow the steps in the Red Hat knowledgebase to download the RHEL kernel debug info.
After the kernel debug info file is downloaded, it can be found at the default path: /usr/lib/debug/lib/modules/`uname -r`/vmlinux
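A quick way to confirm that the debug info matches the running kernel is to test for the file at the default path. This is a sketch: kernel_debuginfo_path is a hypothetical helper that returns the Ubuntu and RHEL paths quoted above. The distro keyword is supplied by the caller, not detected.

```shell
#!/bin/sh
# Hypothetical helper: print the default kernel debug info path for a distro.
# Arguments: <ubuntu|rhel> [<kernel-release>], defaulting to the running kernel.
kernel_debuginfo_path() {
  release="${2:-$(uname -r)}"
  case "$1" in
    ubuntu) echo "/usr/lib/debug/boot/vmlinux-$release" ;;
    rhel)   echo "/usr/lib/debug/lib/modules/$release/vmlinux" ;;
    *)      return 1 ;;   # other distros are not covered by this sketch
  esac
}

# Usage:
# p="$(kernel_debuginfo_path ubuntu)"
# if [ -f "$p" ]; then echo "debug info found: $p"; else echo "missing: $p"; fi
```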
If the debug symbol packages are not available for pre-built kernel images, then analyzing the kernel functions at the source level requires a recompilation of the Linux kernel with debug flag enabled.
If the debug info for the kernel modules is available, any subsequent CPU performance analysis will attribute the kernel space samples appropriately to [vmlinux] module and display the hot kernel functions. Otherwise, kernel samples will be attributed to [kernel.kallsyms]_text.
During the hotspot analysis, consider the following:
If you see the [vmlinux] module, then you should be able to analyze the performance data for kernel functions in the Source view and IMIX view in the GUI. The CLI should also be able to generate source level report and IMIX report for the kernel.
If the source is downloaded and copied to the expected path, then you should be able to see the kernel source lines in GUI and CLI.
Passing the kernel debug file path and the kernel source path is not recommended, as it might lead to performance issues.
In System-wide profile, the callstack samples can be collected for kernel functions. For example, the following command will collect the kernel callstack:
# AMDuProfCLI collect -a -g -o /tmp /usr/bin/stress-ng --cpu 8 --io 4 --vm 2 --vm-bytes 128M --fork 4 --timeout 20s
To capture the source lines of system-module functions, use the --show-sys-src option.
Example
./AMDuProfCLI report --detail --show-sys-src --src-path /usr/src/linux-version/ -i <session_path>
Pass the path to the kernel source files directory using the --src-path option.
Note these constraints:
Do not move the downloaded kernel debug info from its default path.
If the kernel version is upgraded, download the kernel debug info for the latest kernel version. AMD uProf will not show the correct source and assembly if there is any mismatch between the kernel debug info and the kernel version.
While profiling or analyzing kernel samples, do not reboot the system in between. Rebooting the system would cause the kernel to load at a different virtual address due to the KASLR feature of Linux kernel.
The setting in the /proc/sys/kernel/kptr_restrict file enables AMD uProf to resolve kernel symbols and attribute samples to kernel functions. It does not enable source-level, assembly-level, or call-graph analysis.
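The reboot constraint can be guarded against in a script by recording the Linux boot ID when collection starts and comparing it before generating the report. This is a sketch, not an AMD uProf feature: both function names are hypothetical, and where the recorded ID is stored is left to the caller.

```shell
#!/bin/sh
# /proc/sys/kernel/random/boot_id holds an ID that changes on every boot (Linux).
current_boot_id() {
  cat /proc/sys/kernel/random/boot_id 2>/dev/null
}

# Hypothetical helper: succeed only if a recorded boot ID is non-empty
# and equal to the one passed as the second argument.
same_boot() {
  [ -n "$1" ] && [ "$1" = "$2" ]
}

# Usage:
# recorded="$(current_boot_id)"      # at collection time
# ...
# same_boot "$recorded" "$(current_boot_id)" || echo "system rebooted; kernel addresses may differ"
```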
To view source code for FreeBSD kernel modules in pre-release (ALPHA, BETA, RC, etc.) builds, AMDuProf requires debug symbol files.
Note
In FreeBSD pre-release builds, debug files are stored in a custom build path instead of the standard path /usr/lib/debug/boot/kernel. Use the --symbol-path option to specify where pre-release kernel debug files are located.
Example
To generate a report with the debug symbol path for pre-release builds:
$ ./AMDuProfCLI report --detail --show-sys-src --symbol-path /path/to/debug-file/ -i <session-path>
For more information on the --symbol-path option, refer to AMDuProfCLI Report Command Options.