PGPROF/en: Difference between revisions

Updating to match new version of source page
(Updating to match new version of source page)
(Updating to match new version of source page)
Line 18: Line 18:
* pgi/17.3
* pgi/17.3


Use <code>module load pgi/version</code> to choose a version; for example, to load the PGI compiler version 17.3, use:
Use <code>module load pgi/version</code> to select a version; for example, to load the PGI compiler version 17.3, use
{{Command|module load pgi/17.3}}
{{Command|module load pgi/17.3}}


== Compile your code ==
== Compiling your code ==
To get useful information from PGPROF, you first need to compile your code with one of the PGI compilers (<code>pgcc</code> for C, <code>pgc++</code> for C++, <code>pgfortran</code> for Fortran). Fortran source code may need to be compiled with the <code>-g</code> flag.
To get useful information from PGPROF, you first need to compile your code with one of the PGI compilers (<code>pgcc</code> for C, <code>pgc++</code> for C++, <code>pgfortran</code> for Fortran). Fortran source code may need to be compiled with the <code>-g</code> flag.


Line 30: Line 30:
{{Command|pgprof -o a.prof ./a.out}}
{{Command|pgprof -o a.prof ./a.out}}


You can optionally save the data file and analyze it in graphical mode (see below) using ''File | Import''.
The data file can be analyzed in graphical mode with the ''File | Import'' command (see below) or in command-line mode as follows.
<br><br>'''Analysis''': To visualize the performance data in command-line mode:
<br><br>'''Analysis''': To visualize the performance data in command-line mode:
{{Command|pgprof -i a.prof}}
{{Command|pgprof -i a.prof}}
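
The compile, collect and analyze steps described above can be chained in a small script. The following is only a sketch under stated assumptions: the source file name <code>a.c</code>, the <code>run</code> helper and the <code>DRY_RUN</code> variable are hypothetical and not part of PGPROF, and using <code>-g</code> with <code>pgcc</code> is an assumption here (the text above only mentions it for Fortran). By default the script just prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the command-line workflow: compile, collect, analyze.
# SRC, EXE, PROF, DRY_RUN and run() are illustrative names, not PGPROF features.
SRC=${SRC:-a.c}          # hypothetical source file
EXE=${EXE:-./a.out}      # executable to profile
PROF=${PROF:-a.prof}     # performance data file
DRY_RUN=${DRY_RUN:-1}    # 1 = only print the commands, 0 = also execute them

run() {
    # Always echo the command; execute it only when DRY_RUN is 0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

run pgcc -g -o "$EXE" "$SRC"     # compile with a PGI compiler (-g assumed optional for C)
run pgprof -o "$PROF" "$EXE"     # data collection: write the profile to $PROF
run pgprof -i "$PROF"            # analysis: read the saved profile in command-line mode
```

Running it with <code>DRY_RUN=0</code> after <code>module load pgi/17.3</code> would execute the three commands in order; the dry-run default makes it safe to try on a machine where the PGI tools are not loaded.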
Line 84: Line 84:
  }}
  }}


To find out what part of your application takes the longest time to run you can use the option <code>--cpu-profiling-mode bottom-up</code> which orients the call tree to show each function followed by functions that called it working backwards to main.
To find out which part of your application takes the longest to run, use the option <code>--cpu-profiling-mode bottom-up</code>, which orients the call tree to show each function followed by the functions that called it, working backwards to the main function.
{{Command|pgprof --cpu-profiling-mode bottom-up -i a.prof
{{Command|pgprof --cpu-profiling-mode bottom-up -i a.prof
|result=
|result=
Line 104: Line 104:


[[File:pgprof-start-session.png|thumbnail|300px|Starting a new PGPROF session (click for a larger image)|left  ]]
[[File:pgprof-start-session.png|thumbnail|300px|Starting a new PGPROF session (click for a larger image)|left  ]]
In graphical mode, both data collection and analysis can be accomplished in the same session. There are several steps that need to be done to collect and visualize performance data in this mode:
In graphical mode, both data collection and analysis can be accomplished in the same session most of the time. However, it is also possible to do the analysis from the pre-saved performance data file (e.g. collected in the command-line mode).
There are several steps that need to be done to collect and visualize performance data in this mode.<br><br>
'''Data collection'''
* Launch the PGI profiler.
* Launch the PGI profiler.
** Since the PGPROF GUI is based on Java, it should be run on a compute node in an interactive session rather than on a login node, as the latter does not have enough memory (see [[Java#Pitfalls|Java]] for more details). An interactive session can be started with <code>salloc --x11 ...</code> to enable X11 forwarding (see [[Running_jobs#Interactive_jobs|Interactive jobs]] for more details).
** Since the PGPROF GUI is based on Java, it should be run on a compute node in an interactive session rather than on a login node, as the latter does not have enough memory (see [[Java#Pitfalls|Java]] for more details). An interactive session can be started with <code>salloc --x11 ...</code> to enable X11 forwarding (see [[Running_jobs#Interactive_jobs|Interactive jobs]] for more details).
Line 110: Line 112:
* Select the executable file you want to profile and then add any arguments appropriate for your profiling.
* Select the executable file you want to profile and then add any arguments appropriate for your profiling.
* Click ''Next'', then ''Finish''.
* Click ''Next'', then ''Finish''.
'''Analysis'''
* In the ''CPU Details'' tab, click on the ''Show the top-down (callers first) call tree view'' button as shown in the figure below.
* In the ''CPU Details'' tab, click on the ''Show the top-down (callers first) call tree view'' button as shown in the figure below.


[[File:pgprof2.png|thumbnail|300px|Visualizing performance data (click for a larger image)|left  ]]
[[File:pgprof2.png|thumbnail|300px|Visualizing performance data (click for a larger image)|left  ]]
The visualization window is comprised of four panes:
The visualization window is comprised of four panes:
* The Timeline: shows all the events ordered by the time they executed
* The pane on the upper right shows the timeline with all the events ordered by the time at which they were executed.
* '''GPU Details''': shows performance details for the GPU kernels
* '''GPU Details''': shows performance details for the GPU kernels.
* '''CPU Details''': shows performance details for the CPU functions
* '''CPU Details''': shows performance details for the CPU functions.
* '''Properties''': shows all the details for a selected function in the timeline window
* '''Properties''': shows all the details for a selected function in the timeline window.
<br clear=all>
<br clear=all>

