6. Application Analysis - Getting Started

6.1. Profiling Concepts

6.1.1. Sampling

Sampling profilers work on the principle that the parts of a program that consume the most time (or that trigger the most occurrences of the sampling event) collect the largest number of samples. This is because those parts have a higher probability of being executed when the CPU Profiler takes a sample.
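The principle can be illustrated with a small, deterministic sketch. It is not how the CPU Profiler is implemented; the timeline, the function names, and the `sample_timeline` helper are hypothetical and only show that a function's sample count tracks the time it spends running.

```python
from collections import Counter


def sample_timeline(timeline, interval_ms):
    """Attribute periodic timer samples to whichever function is running.

    timeline: list of (function_name, duration_ms) in execution order.
    interval_ms: time between two consecutive samples.
    """
    samples = Counter()
    next_sample = interval_ms  # the first sample fires after one full interval
    start = 0
    for name, duration in timeline:
        end = start + duration
        # Every sample time in (start, end] lands in this function.
        while next_sample <= end:
            samples[name] += 1
            next_sample += interval_ms
        start = end
    return samples


# Hypothetical workload: 'init' runs 20 ms, 'multiply' 70 ms, 'print' 10 ms.
timeline = [("init", 20), ("multiply", 70), ("print", 10)]
samples = sample_timeline(timeline, interval_ms=1)
hottest = samples.most_common(1)[0][0]
```

With a 1 ms interval, the function that runs for 70% of the time receives 70% of the samples, which is why it is reported as the hotspot.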

6.1.2. Sampling Interval

The time between the collection of two consecutive samples is the Sampling Interval. For example, in TBP, if the time interval is 1 millisecond, roughly 1,000 TBP samples are collected every second for each processor core.
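The arithmetic behind the 1 millisecond example can be sketched as follows. The helper name and its parameters are hypothetical, and the estimate assumes that every core is sampled for the whole duration.

```python
def expected_tbp_samples(interval_ms, duration_s, cores):
    """Approximate number of timer samples for a time-based profile."""
    samples_per_second_per_core = 1000 / interval_ms
    return round(samples_per_second_per_core * duration_s * cores)


# 1 ms interval, one core, one second.
per_core_per_second = expected_tbp_samples(interval_ms=1, duration_s=1, cores=1)

# Same interval over 10 seconds on 8 busy cores.
total = expected_tbp_samples(interval_ms=1, duration_s=10, cores=8)
```

Halving the interval doubles the estimate, which is one way to gauge the additional collection overhead before starting a run.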

The purpose of a sampling interval depends on the resource used as the sampling event:

A smaller sampling interval increases the number of samples collected and the data collection overhead. Because the profile data is collected on the same system on which the workload is running, more frequent sampling increases the intrusiveness of profiling. A very small sampling interval can also cause system instability.

6.1.3. Sampling-point

A sampling-point occurs when the sampling-interval of a sampling-event expires. At that point, the interrupt handler collects various profile data, such as the Instruction Pointer, Process Id, Thread Id, and Call-stack.

6.1.4. Event-Counter Multiplexing

If the number of monitored PMC events is less than or equal to the number of available performance counters, each event can be assigned to a counter and monitored 100% of the time. In a single-profile measurement, if the number of monitored events is larger than the number of available counters, the CPU Profiler time-shares the available HW PMC counters. This is called event counter multiplexing. It allows more events to be monitored, but it decreases the actual number of samples for each event and thus reduces the data accuracy. The CPU Profiler auto-scales the sample counts to compensate for event counter multiplexing. For example, if an event is monitored 50% of the time, the CPU Profiler scales the number of event samples by a factor of 2.
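The scaling step can be sketched in a few lines. This is only an illustration of the arithmetic, not the CPU Profiler's implementation: the function names are hypothetical, and `monitored_fraction` assumes the counters are shared equally among the events.

```python
def monitored_fraction(num_events, num_counters):
    """Fraction of time each event is counted, assuming equal time-sharing."""
    if num_events <= 0 or num_counters <= 0:
        raise ValueError("num_events and num_counters must be positive")
    return min(1.0, num_counters / num_events)


def scale_samples(raw_samples, fraction):
    """Scale a raw sample count to estimate the count at 100% monitoring."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return raw_samples / fraction


# Hypothetical case: 12 events share 6 counters, so each is monitored 50% of the time.
fraction = monitored_fraction(num_events=12, num_counters=6)
scaled = scale_samples(raw_samples=400, fraction=fraction)  # scaled by a factor of 2
```

The scaled value is an estimate: events that occur in bursts while they are not being counted are still missed, which is the accuracy loss described above.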

6.2. Profile Types

The following profile types are classified based on the hardware or software sampling events used to collect the profile data.

6.2.1. Time-Based Profile (TBP)

In this profile, the profile data is periodically collected based on the specified OS timer interval. It is used to identify the hotspots of the profiled applications.

6.2.2. Event-Based Profile (EBP)

In this profile, the CPU Profiler uses the PMCs to monitor the various micro-architectural events supported by the AMD x86-based processor. It helps to identify the CPU and memory related performance issues in the profiled applications. To analyze a particular aspect of the profiled application (or system), a specific set of relevant events is grouped and monitored together. The CPU Profiler provides several such predefined EBP configurations, such as Assess Performance and Investigate Branching. You can select any of these predefined configurations to profile and analyze the runtime characteristics of your application. You can also create custom configurations of events to profile.

In this profile mode, a delay called skid occurs between the time at which the sampling interrupt occurs and the time at which the sampled instruction address is collected. This skid distributes the samples in the neighborhood near the actual instruction that triggered a sampling interrupt. This produces an inaccurate distribution of samples and events are often attributed to the wrong instructions.

6.2.3. Instruction-Based Sampling (IBS)

In this profile, the CPU Profiler uses the IBS HW supported by the AMD x86-based processor to observe the effect of instructions on the processor and on the memory subsystem. In IBS, HW events are linked with the instruction that caused them. The CPU Profiler also uses these HW events to derive various metrics, such as data cache latency.

6.2.4. Custom Profile

This profile allows a combination of HW PMC events, OS timer, and IBS sampling events.

6.3. Predefined Sampling Configuration

The Predefined Sampling Configuration provides a convenient way to select a useful set of sampling events for profile analysis. The following table lists all such configurations:

Table 6.1 Predefined Sampling Configurations

| Profile Type | Predefined Configuration Name | Abbreviation | Description |
|---|---|---|---|
| User mode sampling and tracing | Overview Analysis | overview | To get a high-level performance snapshot of an application: identify the hottest functions, their inclusive and exclusive elapsed times, and the CPU utilization of the threads. |
| | Hotspots Analysis | hotspots | To understand the application code flow and the sections of code consuming the most execution time (CPU Time). |
| | Threading Analysis | threading | To identify how efficiently an application uses the processor cores, contention among the application threads due to synchronization, and CPU utilization of the threads. Note: This configuration is available only on Linux. It is supported only on AMD Zen3, AMD Zen4, and AMD Zen5 processors. |
| Time-based profile (TBP) | Time-based profile | tbp | To identify where the programs are consuming time. |
| Event-based profile (EBP) | Assess performance | assess | Provides an overall assessment of the performance. |
| | Assess performance (Extended) | assess_ext | Provides an overall assessment of the performance with additional metrics. |
| | Investigate data access | data_access | To find data access operations with poor L1 data cache locality and poor DTLB behavior. |
| | Investigate instruction access | inst_access | To find instruction fetches with poor L1 instruction cache locality and poor ITLB behavior. |
| | Investigate branching | branch | To find poorly predicted branches and near returns. |
| | Investigate CPI | cpi | To analyze the CPI and IPC metrics of the running application or the entire system. |
| IBS | Instruction based sampling | ibs | To collect the sample data using IBS Fetch and IBS OP. Precise sample attribution to instructions. |
| | Cache Analysis | memory | To identify false cache-line sharing issues. The profile data is collected using IBS OP. |

Note

  1. The AMDuProf GUI uses the name of the predefined configuration in the above table.

  2. The abbreviation is used with the --config option of the AMDuProfCLI collect command.

  3. The supported predefined configurations and the sampling events used in them depend on the processor family and model.

6.4. Predefined View Configuration

A View is a set of sampled event data and computed performance metrics either displayed in the GUI or in the text report generated by the CLI. Each predefined sampling configuration has a list of associated predefined views.

6.4.1. Assess Performance Configurations

Following is the list of predefined view configurations for Assess Performance:

Table 6.2 Assess Performance Configurations

| View Configuration | Abbreviation | Description |
|---|---|---|
| Branch assessment | br_assess | You can use this view to find code with a high branch density and poorly predicted branches. |
| Data access assessment | dc_assess | Provides information about data cache (DC) access including DC miss rate and DC miss ratio. |
| IPC assessment | ipc_assess | Find hotspots with low instruction level parallelism. Provides performance indicators – IPC and CPI. |
| Misaligned access assessment | misalign_assess | You can use this to identify regions of code that access misaligned data. |
| Overall Assessment | triage_assess | This view gives the overall picture of performance, including the instructions per clock cycle (IPC), data cache accesses/misses, mis-predicted branches, and misaligned data access. You can use it to find the possible issues for a deeper investigation. |

6.4.2. Threading Configuration

The following table lists the predefined view configurations for Threading.

Table 6.3 Threading Configurations

| View Configuration | Abbreviation | Description |
|---|---|---|
| IPC assessment | ipc_assess | Find hotspots with low instruction level parallelism. Provides performance indicators – IPC and CPI. Note: This configuration is available only on Linux. It is supported only on AMD Zen3 and AMD Zen4 processors. |
| Time based hotspots | timer | Use this view to find hotspots where the program is spending most of its time. |
| All events | all | Use this view to report all collected events and possible computed metrics. |

6.4.3. Overview Configuration

The following table lists the predefined view configurations for Overview.

Table 6.4 Overview Configurations

| View Configuration | Abbreviation | Description |
|---|---|---|
| IPC assessment | ipc_assess | Find hotspots with low instruction level parallelism. Provides performance indicators – IPC and CPI. Note: This configuration is available only on Linux. It is supported only on AMD Zen3 and AMD Zen4 processors. |
| Time based hotspots | timer | Use this view to find hotspots where the program is spending most of its time. |
| All events | all | Use this view to report all collected events and possible computed metrics. |

6.4.4. Investigate Data Access Configurations

The following table lists the predefined view configurations for Investigate Data Access.

Table 6.5 Investigate Data Access Configurations

| View Configuration | Abbreviation | Description |
|---|---|---|
| Data access assessment | dc_assess | Provides information about data cache (DC) access including DC miss rate and DC miss ratio. |
| Data access report | dc_focus | You can use this view to analyze L1 Data Cache (DC) behavior and compare misses versus refills. |
| DTLB report | dtlb_focus | Provides information about L1 DTLB access and miss rates. |
| IPC assessment | ipc_assess | Find hotspots with low instruction level parallelism. Provides performance indicators – IPC and CPI. |
| Misaligned access assessment | misalign_assess | Identify regions of code that access misaligned data. |

6.4.5. Investigate Branch Configurations

The following table lists the predefined view configurations for Investigate Branch.

Table 6.6 Investigate Branch Configurations

| View Configuration | Abbreviation | Description |
|---|---|---|
| Investigate Branching | Branch | You can use this view to find code with a high branch density and poorly predicted branches. |
| IPC assessment | ipc_assess | Find hotspots with low instruction level parallelism. Provides performance indicators – IPC and CPI. |
| Branch assessment | br_assess | You can use this view to find code with a high branch density and poorly predicted branches. |
| Taken branch report | taken_focus | You can use this view to find the code with a high number of taken branches. |
| Near return report | return_focus | You can use this view to find code with poorly predicted near returns. |

6.4.6. Assess Performance (Extended) Configurations

The following table lists the predefined view configurations for Assess Performance (Extended).

Table 6.7 Assess Performance (Extended) Configurations

| View Configuration | Abbreviation | Description |
|---|---|---|
| Assess Performance (Extended) | triage_assess_ext | This view gives an overall picture of performance. You can use it to find possible issues for deeper investigation. |
| IPC assessment | ipc_assess | Find hotspots with low instruction level parallelism. Provides performance indicators – IPC and CPI. |
| Branch assessment | br_assess | You can use this view to find code with a high branch density and poorly predicted branches. |
| Data access assessment | dc_assess | Provides information about data cache (DC) access including DC miss rate and DC miss ratio. |
| Misaligned access assessment | misalign_assess | Identify regions of code that access misaligned data. |

6.4.7. Investigate Instruction Access Configurations

The following table lists the predefined view configurations for Investigate Instruction Access.

Table 6.8 Investigate Instruction Access Configurations

| View Configuration | Abbreviation | Description |
|---|---|---|
| IPC assessment | ipc_assess | Find hotspots with low instruction level parallelism. Provides performance indicators – IPC and CPI. |
| Instruction cache report | ic_focus | You can use this view to identify regions of code that miss in the Instruction Cache (IC). |
| ITLB report | itlb_focus | You can use this view to analyze and break out ITLB miss rates by levels L1 and L2. |

6.4.8. Investigate CPI Configurations

The following table lists the predefined view configurations for Investigate CPI.

Table 6.9 Investigate CPI Configurations

| View Configuration | Abbreviation | Description |
|---|---|---|
| IPC assessment | ipc_assess | Find hotspots with low instruction level parallelism. Provides performance indicators – IPC and CPI. |

6.4.9. Instruction Based Sampling Configurations

The following table lists the predefined view configurations for Instruction Based Sampling.

Table 6.10 Instruction Based Sampling Configurations

| View Configuration | Abbreviation | Description |
|---|---|---|
| IBS fetch overall | ibs_fetch_overall | You can use this view to display an overall summary of the IBS fetch sample data. |
| IBS fetch instruction cache | ibs_fetch_ic | You can use this view to display a summary of IBS attempted fetch Instruction Cache (IC) miss data. |
| IBS fetch instruction TLB | ibs_fetch_itlb | You can use this view to display a summary of IBS attempted fetch ITLB misses. |
| IBS fetch page translations | ibs_fetch_page | You can use this view to display a summary of the IBS L1 ITLB page translations for attempted fetches. |
| IBS All ops | ibs_op_overall | You can use this view to display a summary of all IBS Op samples. |
| IBS MEM all load/store | ibs_op_ls | You can use this view to display a summary of IBS Op load/store data. |
| IBS MEM data cache | ibs_op_ls_dc | You can use this view to display a summary of DC behavior derived from IBS Op load/store samples. |
| IBS MEM data TLB | ibs_op_ls_dtlb | You can use this view to display a summary of DTLB behavior derived from IBS Op load/store data. |
| IBS MEM locked ops and access by type | ibs_op_ls_memacc | You can use this view to display the uncacheable (UC) memory access, write combining (WC) memory access, and locked load/store operations. |
| IBS MEM translations by page size | ibs_op_ls_page | You can use this view to display a summary of DTLB address translations broken out by page size. |
| IBS MEM forwarding and bank conflicts | ibs_op_ls_expert | You can use this view to display the memory access bank conflicts, data forwarding, and Missed Address Buffer (MAB) hits. |
| IBS BR branch | ibs_op_branch | You can use this view to display the IBS retired branch op measurements including mis-predicted and taken branches. |
| IBS BR return | ibs_op_return | You can use this view to display the IBS return op measurements including the return mis-prediction ratio. |
| IBS NB local/remote access | ibs_op_nb_access | You can use this view to display the number and latency of local and remote accesses. |
| IBS NB cache state | ibs_op_nb_cache | You can use this view to display the cache owned (O) and modified (M) state for NB cache service requests. |
| IBS NB request breakdown | ibs_op_nb_service | You can use this view to display the breakdown of NB access requests. |

Views in AMD Zen3 and later processors

| View Configuration | Abbreviation | Description |
|---|---|---|
| IBS fetch overall | ibs_fetch_overall | You can use this view to display an overall summary of the IBS fetch sample data. |
| IBS fetch instruction cache | ibs_fetch_ic | You can use this view to display a summary of IBS attempted fetch Instruction Cache (IC) miss data (Not supported in AMD Zen3 processors). |
| IBS fetch instruction TLB | ibs_fetch_itlb | You can use this view to display a summary of IBS attempted fetch ITLB misses. |
| IBS fetch page translations | ibs_fetch_page | You can use this view to display a summary of the IBS L1 ITLB page translations for attempted fetches. |
| IBS Branch Analysis | ibs_op_branch | You can use this view to display the IBS retired branch op measurements including mis-predicted and taken branches. |
| IBS Load Op Analysis | ibs_op_ld | You can use this view to analyze the memory load performance issues of an application. |
| IBS Load Op Analysis (ext) | ibs_op_ld_ext | You can use this view to analyze the memory load performance issues of an application. |
| IBS Branch Overview | mibs_op_br_overview | You can use this view to analyze the branch metrics. |
| IBS Load Latency Analysis | ibs_op_ld_lat | You can use this view to analyze the memory load latency performance issues of an application. |
| IBS Memory Overview | ibs_op_ls_overview | You can use this view to understand the memory access pattern of an application. |
| IBS Perf Overview | ibs_op_overview | You can use this view to understand the performance characteristics of an application. |

New Views added in Zen4 and Zen5 Processors

| View Configuration | Abbreviation | Description |
|---|---|---|
| Front End Bottlenecks | ibs_fetch_front_bottle | You can use this view to show the front end bottlenecks. |
| Backend Bound Bottlenecks | ibs_op_backend_bottle | You can use this view to show the backend bound bottlenecks. |
| Bad Speculations | ibs_op_bad_speculation | You can use this to view bad speculations. |

Note

6.5. Preparing an Application for Profiling

AMD uProf uses the debug information generated by the compiler to show the correct function names in the analysis views and to correlate the collected samples with source statements in the Source page. Without debug information, the results of the CPU Profiler are less descriptive and display only the assembly code.

6.5.1. Generating Debug Information on Windows

When using Microsoft Visual C++ to compile the application in release mode, set the following options before compiling to ensure that the debug information is generated and saved in a program database file (with a .pdb extension). To set the compiler option that generates the debug information for an x64 application in release mode, complete the following steps:

  1. Right-click the project and select Properties from the menu.

  2. From the Configuration drop-down, select Active(Release).

  3. From the Platform drop-down, select Active(Win32) or Active(x64).

  4. In the project pane on the left, expand Configuration Properties.

  5. Expand C/C++ and select General.

  6. In the work pane, select Debug Information Format.

  7. From the drop-down, select Program Database (/Zi) or Program Database for Edit & Continue (/ZI).

Property page to set compiler option to generate debug information.

Figure 6.1 AMDTClassicMatMul Property Page

  8. In the project pane, expand Linker and then select Debugging.

  9. From the Generate Debug Info drop-down, select /DEBUG.

6.5.2. Generating Debug Information with Inline functions on Windows

To generate debug information with inline functions for a Release build on Windows using Microsoft Visual C++, you need to configure the compiler and linker settings properly. Complete the following steps:

  1. Open Project Properties: Right-click the project and select Properties from the menu.

  2. Select Configuration and Platform

  3. Set Debug Information Format

    Property page to set compiler option to generate debug information.

    Figure 6.2 AMDTClassicMatMul Property Page

  4. Configure Optimization Settings

    Property page to set compiler option to generate debug information for inline functions.

    Figure 6.3 AMDTClassicMatMul Property Page

  5. Configure Linker Debugging Settings

    Property page to set linker option to generate debug information.

    Figure 6.4 AMDTClassicMatMul Property Page

  6. Configure Linker Optimization Settings

    Property page to set linker optimization options for debug information.

    Figure 6.5 AMDTClassicMatMul Property Page

6.5.3. Generating Debug Information on Linux

The application must be compiled with the -g option to enable the compiler to generate debug information. Modify either the Makefile or the respective build scripts accordingly.

6.6. Workflow

The AMD uProf workflow has the following phases:

  1. Collect: Run the application program and collect the profile data.

  2. Translate: Process the profile data to aggregate, correlate, and organize it into a database.

  3. Analyze: View and analyze the performance data to identify the bottlenecks.

6.6.1. Collect Phase

Important concepts of the collect phase are explained in this section.

6.6.1.1. Profile Target

The profile target is one of the following for which profile data will be collected:

6.6.1.2. Profile Type

The profile type defines the type of profile data collected and how the data should be collected. The following profile types are supported:

The data collection is defined by Sampling Configuration:

Sampling Configuration identifies the set of Sampling Events, their Sampling Interval, and mode. A Sampling Event is a resource used to trigger a sampling point at which a sample (profile data) is collected. The Sampling Interval defines the number of occurrences of the sampling event after which an interrupt is generated to collect the sample. The mode defines when to count the occurrences of the sampling event: in User mode, in OS mode, or in both.

Sampled Data is the profile data that is collected when the interrupt is generated (upon the expiry of the sampling interval of a sampling event).
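The relationship between these terms can be modeled with a small data structure. This is a hypothetical sketch for illustration only: the class names, field names, and event names are assumptions and do not reflect the file format or API that AMD uProf uses.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SamplingEvent:
    name: str              # hypothetical event name
    interval: int          # occurrences of the event between two samples
    user_mode: bool = True  # count occurrences in User mode
    os_mode: bool = True    # count occurrences in OS mode


@dataclass
class SamplingConfiguration:
    name: str
    events: List[SamplingEvent] = field(default_factory=list)


def samples_triggered(event, occurrences):
    """Number of sampling points after the given number of event occurrences."""
    return occurrences // event.interval


config = SamplingConfiguration(
    name="custom-example",
    events=[
        SamplingEvent(name="retired_instructions", interval=250_000, os_mode=False),
    ],
)
triggered = samples_triggered(config.events[0], occurrences=1_000_000)
```

In this model, one million occurrences of an event with an interval of 250,000 produce four sampling points, each of which would carry the sampled data described above.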

The following table shows the type of profile data collected and sampling events for a profile type:

Table 6.11 Sampled Data

| Profile Type | Type of Profile Data Collected | Sampling Events |
|---|---|---|
| GPU Tracing | Runtime Trace: HIP and HSA | Not applicable |
| GPU Profiling | Perfmon Metrics | Not applicable |
| CPU Tracing | Collects pthread API, system calls, function trace, page faults and memory allocations | Not applicable |
| CPU Profiling | Process ID; Thread ID; IP; Callstack; ETL tracing (Windows only); OpenMP Trace (Linux): OMPT and OMPLIB; MPI Trace: PMPI (Linux); OS Trace: Linux BPF | OS Timer; Core PMC events; IBS |

For CPU Profiling, there are numerous micro-architecture specific events available to monitor. The tool groups related events of interest into sets called Predefined Sampling Configurations. For example, Assess Performance is one such configuration, used to get an overall assessment of the performance and to find potential issues for investigation. For more information, see Predefined View Configuration.

A Custom Sampling Configuration is one in which you define a sampling configuration with the events of interest.

6.6.1.3. Profile Configuration

A profile configuration identifies all the information used to collect the measurement. It contains the information about profile target, sampling configuration, data to sample, and profile scheduling details.

The GUI saves these profile configuration details under a default name (for example, AMDuProf-TBP-Classic); you can also specify your own name. Because performance analysis is iterative, the saved configuration is persistent (until you delete it), so you can reuse the same configuration for future data collection runs.

6.6.1.4. Profile Session (or Profile Run)

A profile session represents a single performance experiment for a profile configuration. The tool saves all the profile and translated data (in a database) in the folder <profile config name>-<timestamp>.

Once the profile data is collected, uProf processes the data to aggregate and attribute the samples to the respective processes, threads, load modules, functions, and instructions. This aggregated data is then written into an SQLite database that is used during the Analyze phase. This translation of the raw profile data happens when the CLI generates the profile report or when the GUI generates the visualization.
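The aggregation step can be pictured with a minimal sketch. The sample records, field names, and `aggregate` helper below are hypothetical; they only illustrate how raw samples roll up to different levels, not the format of the raw profile data or of the database.

```python
from collections import Counter


def aggregate(samples, *keys):
    """Count samples grouped by the given fields (for example, module and function)."""
    counts = Counter()
    for sample in samples:
        counts[tuple(sample[key] for key in keys)] += 1
    return counts


# Hypothetical raw samples, one record per sampling point.
samples = [
    {"pid": 4242, "tid": 1, "module": "app", "function": "multiply"},
    {"pid": 4242, "tid": 1, "module": "app", "function": "multiply"},
    {"pid": 4242, "tid": 2, "module": "app", "function": "init"},
    {"pid": 4242, "tid": 2, "module": "libc", "function": "memcpy"},
]

by_function = aggregate(samples, "module", "function")
by_thread = aggregate(samples, "pid", "tid")
by_process = aggregate(samples, "pid")
```

The same set of samples yields the per-process, per-thread, and per-function counts, which is why a single collection run can back several views in the Analyze phase.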

6.6.2. Translate and Report Phases

The collected raw profile data is processed to aggregate and attribute to the respective processes, threads, load modules, functions, and instructions. The debug information for the launched application generated by the compiler is needed to correlate the samples to functions and source lines.

This phase is performed automatically in the GUI after the profiling is stopped. In the CLI, the report command implicitly processes (translates) the raw profile data and generates the report in CSV format. The CLI also provides the translate command, which performs only the translation; the translated data files can then be imported into the GUI for visualization.

6.6.3. Analyze Phase

6.6.3.1. View Configuration

A View is a set of sampled event data and computed performance metrics either displayed in the GUI pages or in the text report generated by the CLI. Each predefined sampling configuration has a list of associated predefined views.

The tool can filter the data so that only specific event data and metrics are shown; such a selection is called a Predefined View. For example, the IPC assessment view lists metrics such as CPU Clocks, Retired Instructions, IPC, and CPI. For more information, see Predefined View Configuration.
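IPC and CPI are reciprocal ratios of two of the metrics listed in that view. The following sketch shows the arithmetic with made-up counts; the function name and the values are hypothetical.

```python
def ipc_and_cpi(cpu_clocks, retired_instructions):
    """Return (IPC, CPI) computed from CPU clock and retired instruction counts."""
    if cpu_clocks <= 0 or retired_instructions <= 0:
        raise ValueError("both counts must be positive")
    ipc = retired_instructions / cpu_clocks   # instructions per clock cycle
    cpi = cpu_clocks / retired_instructions   # clock cycles per instruction
    return ipc, cpi


# Hypothetical function: 3,000,000 retired instructions in 2,000,000 CPU clocks.
ipc, cpi = ipc_and_cpi(cpu_clocks=2_000_000, retired_instructions=3_000_000)
```

A function with a low IPC (equivalently, a high CPI) relative to the rest of the program is a candidate for closer investigation.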

6.6.4. Export Session

The CLI option --export-session generates a compressed archive containing the essential session files. The compressed archive can be easily transported to another system, where the GUI can be used to analyze the performance data.

This feature streamlines the process of transferring and utilizing session files across multiple systems, enhancing accessibility and enabling smooth workflow continuity.

Complete the following procedure to export a session:

  1. Generate the compressed archive with the translate, report, or profile command. A .zip file is generated.

  2. Copy the .zip file to another system and decompress it.

    The decompressed session directory can be imported to GUI for data visualization and analysis. To import the decompressed session and to analyze the performance data, refer to Importing Profile Database.

6.6.4.1. Common Usage

Example

Launch the application AMDTClassicMatMul.exe and collect the Time-Based Profile (TBP) samples and generate a report with the export session option enabled.

6.6.5. Import Session

To analyze a session exported using the CLI, click HOME > Import Session to go to the Import Profile Session page.

Import the processed profile data collected using the CLI or the processed profile data saved in GUI’s profile session storage path.

Figure 6.6 Import Profile Page

Using the Import Profile page, you can import the processed profile data collected using the CLI or the processed profile data saved in GUI’s profile session storage path. You must do the following:

6.6.6. Remote Profile

6.6.6.1. Overview

AMD uProf can connect to a remote system, trigger the collection and translation of data on that system, and then visualize the data in the local GUI.

Note

CLI does not support remote profiling.

AMD uProf uses a separate AMDProfilerService binary that can be launched as an application server on the remote target; the local GUI can then connect to this server. By default, authorization must be set up on the server before the local GUI can connect to it.

Complete the following steps:

  1. Locate the local GUI client ID.

  2. Authorize the client ID on the remote target to connect to AMDProfilerService.

  3. Launch AMDProfilerService with appropriate options/permissions on remote target.

  4. Specify the connection details in the local GUI to connect to the remote target.

  5. Local GUI updates itself and displays the remote data (including settings, session history, available events for profiling/tracing, and so on).

  6. Proceed to import session/profile on the remote target.

  7. When you are done with remote target, disconnect to update the local data in GUI.

Support

Remote profiling from Windows (host/local platform) to Linux (target/remote platform) is supported.

6.6.6.2. Setting up Authorization

To set up authorization:

  1. Navigate to PROFILE > Remote Profile and locate the Client ID.

    Connect to the remote machine and view the Client IDs.

    Figure 6.7 Client ID

  2. Copy the Client ID (alphanumeric value).

  3. On remote target, navigate to the AMD uProf bin directory and execute the following command:

    AMDProfilerService --add <client_id>
    

    This will authorize the client to connect to this remote target. To revoke the authorization, execute the following command:

    AMDProfilerService --clear-user <client_id>
    

6.6.6.3. Launching AMDProfilerService

Specify the binding IP address to launch AMDProfilerService as an application server:

AMDProfilerService --ip 127.0.0.1

This IP address should be one of the IP addresses of the target/remote machine on which AMDProfilerService is launched.

If the target/remote machine has multiple IP addresses, the ping command can be used on the host/local machine to determine which IP address (of the remote machine) is reachable from the local machine. The reachable IP address can be passed to the --ip option.

AMDProfilerService also has an interactive mode for selecting the IP address. To launch the application server in interactive mode, run:

AMDProfilerService

Then select the correct IP address.

You can also specify any of the following options:

Table 6.12 AMDProfilerService Options

| Option | Description |
|---|---|
| --ip <ip_address> | Specify the IP address. Note: This is a required option. |
| --port <port_number> | Specify the port number. |
| --ipv6 | Flag to enable IPv6 Networking. |
| --logpath <path> | Specify the log file path. |
| --bypass-auth | Skip the authorization. Note: This option must be used with caution as it will skip the authorization. |
| --api-version | To print supported HTTP API version. Note: Use this to check compatibility with GUI. |
| --version, -v | Get the version information. |
| --add <client_id> | Add the Client ID. |
| --clear-user <client_id> | Remove a particular Client ID. |
| --clear-all | Remove all registered clients. |
| --fsearch-depth <depth> | Specify the maximum depth for recursive file search operations. Note: This option is applicable only for importing a session from the GUI. |
| --fsearch-timeout <timeout> | Specify the maximum duration (in seconds) for recursive file search operations. Note: This option is applicable only for importing a session from the GUI. |
| --quiet | To skip any user prompt. |

Example of a remote profiling connection establishment:

Connect to the remote machine and view the Client IDs.

Figure 6.8 Remote Profiling Connection Establishment

Example of an IP selection:

Option to select IP address.

Figure 6.9 IP Address Selection

6.6.6.4. Launching AMDProfilerService with IPv6

AMDProfilerService supports the IPv6 networking scheme. To enable IPv6 support, use the --ipv6 flag on the command line:

AMDProfilerService --ipv6 --ip fe80::6a05:caff:fe51:8a7f%enp97s0

Interactive mode is also supported with IPv6 addressing. To use the interactive mode with IPv6 support, run:

AMDProfilerService --ipv6

Option to select IP address.

Figure 6.10 IP Address Selection

6.6.6.5. Connecting to a Remote Target

To connect to the remote target:

  1. Once AMDProfilerService is launched on the remote target, go to the Remote Profile page and specify the IP address, port number, and optional name for the remote target as follows:

  2. Click Connect.

    The remote target data is displayed after a few seconds. From this point on, all the profiling and session import steps are the same as for a local target. After the connection is established, the provided IP address, port, and name are saved:

    Remote target data.

    Figure 6.11 Remote Target Data

    Double-click any table entry containing an IP address to load the corresponding details and connect to the required remote target.

    After the connection is established, the title bar reflects the connection to the remote target, and the Disconnect button on the Remote Profile page is enabled.

    Disconnect button enabled.

    Figure 6.12 Disconnect Button Enabled

6.6.6.6. Limitations

Here is a list of the limitations: