Line 23: | Line 23: | ||
Why are hotspots in the code so important? | Why are hotspots in the code so important? | ||
Amdahl's law implies that parallelizing the most time-consuming routines (i.e. the hotspots) will have the most impact on overall run time. | Amdahl's law implies that parallelizing the most time-consuming routines (i.e. the hotspots) will have the most impact on overall run time. | ||
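To make the argument concrete, here is a minimal sketch of Amdahl's law in C++. The function name <code>amdahl_speedup</code> and the example fractions are illustrative assumptions, not part of the tutorial code. With <code>p</code> the fraction of run time spent in a routine and <code>s</code> the speedup obtained for that routine alone, the overall speedup is 1 / ((1 - p) + p / s).

```cpp
#include <cassert>

// Amdahl's law: overall speedup when a fraction p of the run time
// is accelerated by a factor s and the remaining (1 - p) is unchanged.
// The name amdahl_speedup is hypothetical, chosen for this sketch only.
double amdahl_speedup(double p, double s) {
    assert(p >= 0.0 && p <= 1.0);  // p is a fraction of total run time
    assert(s > 0.0);               // s is the speedup of the accelerated part
    return 1.0 / ((1.0 - p) + p / s);
}
```

For example, accelerating a hotspot that accounts for 90% of the run time by a factor of 10 gives an overall speedup of about 5.3, whereas the same factor applied to a routine that accounts for only 10% of the run time gives about 1.1.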
== Build the Sample Code == <!--T:10--> | == Build the Sample Code == <!--T:10--> | ||
For this example we will use code from the [https://github.com/calculquebec/cq-formation-openacc cq-formation-openacc repository]. Download the package and change to the '''cpp''' or '''f90''' directory. The objective of this exercise is to compile and link the code to obtain an executable, and then profile it. | For this example we will use code from the [https://github.com/calculquebec/cq-formation-openacc cq-formation-openacc repository]. Download the package and change to the '''cpp''' or '''f90''' directory. The objective of this exercise is to compile and link the code to obtain an executable, and then profile it. | ||
Line 53: | Line 52: | ||
}} | }} | ||
<translate> | |||
After the executable is created, we are going to profile that code. | After the executable is created, we are going to profile that code. | ||
</translate> | |||
{{Callout | {{Callout | ||
|title=<translate><!--T:6--> | |title=<translate><!--T:6--> | ||
Line 68: | Line 70: | ||
}} | }} | ||
<translate> | |||
=== PGPROF Profiler === | === PGPROF Profiler === | ||
[[File:Pgprof new0.png|thumbnail|300px|Starting a new PGPROF session|left ]] | [[File:Pgprof new0.png|thumbnail|300px|Starting a new PGPROF session|left ]] | ||
Line 83: | Line 85: | ||
=== NVIDIA NVPROF Command Line Profiler === | === NVIDIA NVPROF Command Line Profiler === | ||
NVIDIA also provides a command-line profiler called NVPROF, similar to GNU gprof. | NVIDIA also provides a command-line profiler called NVPROF, similar to GNU gprof. | ||
</translate> | |||
{{Command | {{Command | ||
|nvprof --cpu-profiling on ./cg.x | |nvprof --cpu-profiling on ./cg.x | ||
Line 100: | Line 103: | ||
======== Data collected at 100Hz frequency | ======== Data collected at 100Hz frequency | ||
}} | }} | ||
<translate> | |||
== Compiler Feedback == | == Compiler Feedback == | ||
Before working on the routine, we need to understand what the compiler is actually doing by asking ourselves the following questions: | Before working on the routine, we need to understand what the compiler is actually doing by asking ourselves the following questions: | ||
Line 118: | Line 121: | ||
CXXFLAGS=-fast -Minfo=all,intensity,ccff LDFLAGS=${CXXFLAGS} | CXXFLAGS=-fast -Minfo=all,intensity,ccff LDFLAGS=${CXXFLAGS} | ||
* Rebuild | * Rebuild | ||
</translate> | |||
{{Command | {{Command | ||
|make | |make | ||
Line 158: | Line 162: | ||
pgc++ CXXFLAGS=-fast -Minfo=all,intensity,ccff LDFLAGS=-fast main.o -o cg.x -fast | pgc++ CXXFLAGS=-fast -Minfo=all,intensity,ccff LDFLAGS=-fast main.o -o cg.x -fast | ||
}} | }} | ||
<translate> | |||
== Computational Intensity == | == Computational Intensity == | ||
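The computational intensity of a loop is the ratio of the computational work it performs (arithmetic operations) to the number of memory operations (loads and stores) it issues. | The computational intensity of a loop is the ratio of the computational work it performs (arithmetic operations) to the number of memory operations (loads and stores) it issues. | ||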
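As a rough illustration, the ratio can be estimated by hand for a simple loop. The sketch below is an assumption-laden example: the helper <code>intensity</code> and the loop <code>axpy</code> are hypothetical names, and the compiler's own counting rules for the <code>intensity</code> feedback may differ from this manual count.

```cpp
#include <cassert>

// Hypothetical helper: arithmetic operations divided by memory operations
// for one loop iteration. This is a hand count, not the compiler's metric.
double intensity(int compute_ops, int memory_ops) {
    assert(memory_ops > 0);
    return static_cast<double>(compute_ops) / memory_ops;
}

// Example loop: per iteration, one multiply and one add (2 compute ops),
// two loads (x[i], y[i]) and one store (y[i]) (3 memory ops).
void axpy(int n, double a, const double* x, double* y) {
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Under this manual count, <code>intensity(2, 3)</code> is about 0.67 for the loop in <code>axpy</code>: the loop does less arithmetic than memory traffic.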
Line 168: | Line 172: | ||
== Understanding the code == | == Understanding the code == | ||
Let's look closely at the following code: | Let's look closely at the following code: | ||
</translate> | |||
<syntaxhighlight lang="cpp" line highlight="1,5,10,12"> | <syntaxhighlight lang="cpp" line highlight="1,5,10,12"> | ||
for(int i=0;i<num_rows;i++) { | for(int i=0;i<num_rows;i++) { | ||
Line 182: | Line 187: | ||
} | } | ||
</syntaxhighlight> | </syntaxhighlight> | ||
<translate> | |||
Given the code above, we search for data dependencies: | Given the code above, we search for data dependencies: | ||
* Does one loop iteration affect other loop iterations? | * Does one loop iteration affect other loop iterations? | ||
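The following sketch contrasts three loop patterns that come up when searching for data dependencies. The function names are hypothetical and the loops are not taken from the tutorial code; they only illustrate, under those assumptions, what "one iteration affecting another" looks like in practice.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Independent iterations: out[i] depends only on in[i],
// so iterations can run in any order (or in parallel).
void scale(const std::vector<double>& in, std::vector<double>& out, double a) {
    assert(out.size() == in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = a * in[i];
    }
}

// Loop-carried dependency: iteration i reads v[i - 1], which was
// written by iteration i - 1, so the order of iterations matters.
void prefix_sum(std::vector<double>& v) {
    for (std::size_t i = 1; i < v.size(); ++i) {
        v[i] += v[i - 1];
    }
}

// Reduction: every iteration updates the same accumulator, but the
// partial sums can be combined in any order (up to floating-point rounding).
double sum_all(const std::vector<double>& v) {
    double total = 0.0;
    for (double x : v) {
        total += x;
    }
    return total;
}
```

Only the first loop is trivially parallel. The second cannot be parallelized as written, and the third needs to be treated as a reduction rather than as a plain independent loop.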
Line 189: | Line 195: | ||
[[OpenACC Tutorial - Adding directives|Onward to the next unit: Adding directives]]<br> | [[OpenACC Tutorial - Adding directives|Onward to the next unit: Adding directives]]<br> | ||
[[OpenACC Tutorial|Back to the lesson plan]] | [[OpenACC Tutorial|Back to the lesson plan]] | ||
</translate> |