OpenACC Tutorial - Profiling: Difference between revisions

Marked this version for translation
No edit summary
(Marked this version for translation)
Line 53: Line 53:


<translate>
<!--T:11-->
After the executable is created, we are going to profile that code.
</translate>
Line 71: Line 72:


<translate>
<translate>
=== PGPROF Profiler === <!--T:12-->
[[File:Pgprof new0.png|thumbnail|300px|Starting a new PGPROF session|left]]
The following screenshots show how to get started with the PGPROF profiler. The first step is to start a new session.
Line 77: Line 78:
Finally, specify the profiling options. For example, to profile CPU activity, check the "Profile execution of the CPU" box.


=== NVIDIA Visual Profiler === <!--T:13-->


<!--T:14-->
Another profiler available for OpenACC applications is the NVIDIA Visual Profiler (NVVP). It is a cross-platform analysis tool for code written with OpenACC directives and CUDA C/C++.
[[File:Nvvp-pic0.png|thumbnail|300px|NVVP profiler|right]]
[[File:Nvvp-pic1.png|thumbnail|300px|Browse for the executable you want to profile|right]]


=== NVIDIA NVPROF Command Line Profiler === <!--T:15-->
NVIDIA also provides a command-line profiler called NVPROF, similar to GNU gprof.
</translate>
</translate>
Line 104: Line 106:
}}
}}
<translate>
<translate>
== Compiler Feedback == <!--T:16-->
Before working on the routine, we need to understand what the compiler is actually doing by asking ourselves the following questions:
* What optimizations were applied?
Line 110: Line 112:
* Can very minor modifications of the code affect performance?


<!--T:17-->
The PGI compiler provides the '''-Minfo''' flag, which accepts the following options:
* accel – Print compiler operations related to the accelerator
Line 116: Line 119:
* ccff – Add information to the object files for use by tools


== How to Enable Compiler Feedback == <!--T:18-->
* Edit the Makefile
CXX=pgc++
Line 163: Line 166:
}}
}}
<translate>
<translate>
== Computational Intensity == <!--T:19-->
The computational intensity of a loop measures how much computation it performs relative to the number of memory operations it issues.


<!--T:20-->
'''Computational Intensity = Compute Operations / Memory Operations'''


<!--T:21-->
A computational intensity of 1.0 or greater suggests that the loop might run well on a GPU.
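
The ratio can be made concrete with a small hand count. The following is a minimal sketch, not taken from the tutorial's code: the helper <code>computational_intensity</code> and the <code>saxpy</code> loop are hypothetical names chosen for illustration, and the way operations are counted here (one per arithmetic operation, one per array load or store) is an assumption. The compiler's own count may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: ratio of compute operations to memory operations.
double computational_intensity(double compute_ops, double memory_ops) {
    assert(memory_ops > 0.0);  // the ratio is undefined without memory operations
    return compute_ops / memory_ops;
}

// Illustrative loop (not from the tutorial): y[i] = a * x[i] + y[i]
// Counted by hand, per iteration:
//   compute operations: 2 (one multiply, one add)
//   memory operations:  3 (load x[i], load y[i], store y[i])
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Under this counting, <code>computational_intensity(2.0, 3.0)</code> gives roughly 0.67, which is below 1.0: the loop spends more effort moving data than computing, so it is likely limited by memory bandwidth.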


== Understanding the code == <!--T:22-->
Let's look closely at the following code:
</translate>
</translate>
Line 188: Line 193:
</syntaxhighlight>  
</syntaxhighlight>  
<translate>
<translate>
<!--T:23-->
Given the code above, we search for data dependencies:
* Does one loop iteration affect other loop iterations?
Line 193: Line 199:
* Is sum a data dependency? No, it’s a reduction.
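
The difference between a reduction and a true data dependency can be sketched with two small loops. Both functions below are hypothetical illustrations rather than the tutorial's code: <code>dot</code> accumulates into a single variable, while <code>prefix_sum</code> makes each iteration depend on the result of the previous one.

```cpp
#include <cstddef>
#include <vector>

// Reduction: every iteration only adds its own term into sum, so the terms
// can be computed independently and combined afterwards.
// In OpenACC this pattern is declared with a reduction(+:sum) clause.
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}

// Loop-carried dependency: iteration i reads v[i - 1], which iteration i - 1
// has just written, so the iterations cannot simply run independently.
void prefix_sum(std::vector<double>& v) {
    for (std::size_t i = 1; i < v.size(); ++i) {
        v[i] += v[i - 1];
    }
}
```

In <code>dot</code>, no iteration reads a value produced by another iteration apart from the accumulator, which is why it counts as a reduction. In <code>prefix_sum</code>, running the iterations out of order would change the result.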


<!--T:24-->
[[OpenACC Tutorial - Adding directives|Onward to the next unit: Adding directives]]<br>
[[OpenACC Tutorial|Back to the lesson plan]]
</translate>
</translate>