No edit summary |
(Marked this version for translation) |
||
Line 53: | Line 53: | ||
<translate> | <translate> | ||
<!--T:11--> | |||
After the executable is created, we are going to profile that code. | After the executable is created, we are going to profile that code. | ||
</translate> | </translate> | ||
Line 71: | Line 72: | ||
<translate> | <translate> | ||
=== PGPROF Profiler === | === PGPROF Profiler === <!--T:12--> | ||
[[File:Pgprof new0.png|thumbnail|300px|Starting a new PGPROF session|left ]] | [[File:Pgprof new0.png|thumbnail|300px|Starting a new PGPROF session|left ]] | ||
The following screenshots show how to get started with the PGPROF profiler. The first step is to start a new session. | The following screenshots show how to get started with the PGPROF profiler. The first step is to start a new session. | ||
Line 77: | Line 78: | ||
Finally, specify the profiling options; for example, if you need to profile CPU activity then click the "Profile execution of the CPU" box. | Finally, specify the profiling options; for example, if you need to profile CPU activity then click the "Profile execution of the CPU" box. | ||
=== NVIDIA Visual Profiler === | === NVIDIA Visual Profiler === <!--T:13--> | ||
<!--T:14--> | |||
Another profiler available for OpenACC applications is the NVIDIA Visual Profiler. It is a cross-platform analysis tool for code written with OpenACC and CUDA C/C++ instructions. | Another profiler available for OpenACC applications is the NVIDIA Visual Profiler. It is a cross-platform analysis tool for code written with OpenACC and CUDA C/C++ instructions. | ||
[[File:Nvvp-pic0.png|thumbnail|300px|NVVP profiler|right ]] | [[File:Nvvp-pic0.png|thumbnail|300px|NVVP profiler|right ]] | ||
[[File:Nvvp-pic1.png|thumbnail|300px|Browse for the executable you want to profile|right ]] | [[File:Nvvp-pic1.png|thumbnail|300px|Browse for the executable you want to profile|right ]] | ||
=== NVIDIA NVPROF Command Line Profiler === | === NVIDIA NVPROF Command Line Profiler === <!--T:15--> | ||
NVIDIA also provides a command line version called NVPROF, similar to GPU prof | NVIDIA also provides a command line version called NVPROF, similar to GPU prof | ||
</translate> | </translate> | ||
Line 104: | Line 106: | ||
}} | }} | ||
<translate> | <translate> | ||
== Compiler Feedback == | == Compiler Feedback == <!--T:16--> | ||
Before working on the routine, we need to understand what the compiler is actually doing by asking ourselves the following questions: | Before working on the routine, we need to understand what the compiler is actually doing by asking ourselves the following questions: | ||
* What optimizations were applied? | * What optimizations were applied? | ||
Line 110: | Line 112: | ||
* Can very minor modifications of the code affect performance? | * Can very minor modifications of the code affect performance? | ||
<!--T:17--> | |||
The PGI compiler offers you a '''-Minfo''' flag with the following options: | The PGI compiler offers you a '''-Minfo''' flag with the following options: | ||
* accel – Print compiler operations related to the accelerator | * accel – Print compiler operations related to the accelerator | ||
Line 116: | Line 119: | ||
* ccff – Add information to the object files for use by tools | * ccff – Add information to the object files for use by tools | ||
== How to Enable Compiler Feedback == | == How to Enable Compiler Feedback == <!--T:18--> | ||
* Edit the Makefile | * Edit the Makefile | ||
CXX=pgc++ | CXX=pgc++ | ||
Line 163: | Line 166: | ||
}} | }} | ||
<translate> | <translate> | ||
== Computational Intensity == | == Computational Intensity == <!--T:19--> | ||
The computational intensity of a loop measures how much computation is done relative to the number of memory operations. | The computational intensity of a loop measures how much computation is done relative to the number of memory operations. | ||
<!--T:20--> | |||
'''Computational Intensity = Compute Operations / Memory Operations''' | '''Computational Intensity = Compute Operations / Memory Operations''' | ||
<!--T:21--> | |||
Computational Intensity of 1.0 or greater suggests that the loop might run well on a GPU. | Computational Intensity of 1.0 or greater suggests that the loop might run well on a GPU. | ||
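As a rough illustration of the ratio above, the sketch below counts operations by hand for a hypothetical loop body, <code>y[i] = a * x[i] + y[i]</code>, which is not taken from this tutorial's code. The helper names <code>computational_intensity</code> and <code>axpy_intensity</code> are assumptions made for this example only, and the operation counts are a simplification that ignores caching and fused instructions.

```cpp
#include <cstddef>

// Hypothetical helper (not part of any library): ratio of compute
// operations to memory operations for one loop iteration.
double computational_intensity(std::size_t compute_ops, std::size_t memory_ops) {
    return static_cast<double>(compute_ops) / static_cast<double>(memory_ops);
}

// Hand count for the illustrative loop body
//     y[i] = a * x[i] + y[i];
// compute operations: 1 multiply + 1 add          = 2
// memory operations:  load x[i], load y[i], store y[i] = 3
// (the scalar a is assumed to stay in a register)
double axpy_intensity() {
    return computational_intensity(2, 3);
}
```

Under this hand count the loop has an intensity of about 0.67, below the 1.0 guideline, so a loop of this shape would likely be limited by memory traffic rather than by arithmetic.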
== Understanding the code == | == Understanding the code == <!--T:22--> | ||
Let's look closely at the following code: | Let's look closely at the following code: | ||
</translate> | </translate> | ||
Line 188: | Line 193: | ||
</syntaxhighlight> | </syntaxhighlight> | ||
<translate> | <translate> | ||
<!--T:23--> | |||
Given the code above, we search for data dependencies: | Given the code above, we search for data dependencies: | ||
* Does one loop iteration affect other loop iterations? | * Does one loop iteration affect other loop iterations? | ||
Line 193: | Line 199: | ||
* Is sum a data dependency? No, it’s a reduction. | * Is sum a data dependency? No, it’s a reduction. | ||
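To make the reduction point concrete, here is a minimal sketch of a loop whose only cross-iteration coupling is an accumulation into <code>sum</code>. It is a hypothetical dot product, not the routine shown above; the function name <code>dot</code> and the choice of data clauses are assumptions for illustration. A compiler without OpenACC support will typically ignore the pragma and run the loop serially.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example: every iteration adds into sum, but the additions
// can be combined in any order, so the loop is a reduction rather than a
// true loop-carried dependency.
double dot(const std::vector<double>& x, const std::vector<double>& y) {
    const std::size_t n = x.size() < y.size() ? x.size() : y.size();
    const double* xp = x.data();  // raw pointers so the data clauses can name array sections
    const double* yp = y.data();
    double sum = 0.0;

    // The reduction clause gives each parallel worker a private partial
    // sum and combines the partial sums when the loop finishes.
    #pragma acc parallel loop reduction(+:sum) copyin(xp[0:n], yp[0:n])
    for (std::size_t i = 0; i < n; ++i) {
        sum += xp[i] * yp[i];
    }
    return sum;
}
```

Because floating-point addition is not associative, a parallel reduction may differ from the serial result in the last few bits for general inputs.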
<!--T:24--> | |||
[[OpenACC Tutorial - Adding directives|Onward to the next unit: Adding directives]]<br> | [[OpenACC Tutorial - Adding directives|Onward to the next unit: Adding directives]]<br> | ||
[[OpenACC Tutorial|Back to the lesson plan]] | [[OpenACC Tutorial|Back to the lesson plan]] | ||
</translate> | </translate> |