(The Visual Profiler only works with CUDA C/C++ or OpenACC codes) |
(Showing nvprof first - nvvp will be moved to another page.) |
||
Line 25: | Line 25: | ||
== Build the Sample Code == <!--T:10--> | == Build the Sample Code == <!--T:10--> | ||
For this example we | For this example we use code from this [https://github.com/calculquebec/cq-formation-openacc Git repository]. | ||
Download the package and go to the <code>cpp</code> or the <code>f90</code> directory. | Download the package and go to the <code>cpp</code> or the <code>f90</code> directory. | ||
The object of this exercise is to compile and link the code, obtain an executable, and then profile it. | The object of this exercise is to compile and link the code, obtain an executable, and then profile it. | ||
Line 78: | Line 78: | ||
<translate> | <translate> | ||
<!--T:7--> | <!--T:7--> | ||
For the purpose of this tutorial, we use two profilers | For the purpose of this tutorial, we use two profilers: | ||
* NVIDIA Visual Profiler NVVP - a cross-platform analyzing tool for the codes written with OpenACC and CUDA C/C++ instructions | * NVPROF - a command line text-based profiler that can analyze non-GPU codes. | ||
* NVIDIA Visual Profiler (NVVP) - a graphical, cross-platform analysis tool for code written with OpenACC and CUDA C/C++ instructions. | |||
</translate> | </translate> | ||
}} | }} | ||
<translate> | |||
Since our previously built <code>cg.x</code> is not yet using the GPU, we will start the analysis with the <code>nvprof</code> profiler. | |||
=== NVIDIA NVPROF Command Line Profiler === <!--T:15--> | |||
NVIDIA also provides a command line profiler called NVPROF, similar to GNU gprof. | |||
</translate> | |||
{{Command | |||
|module load cuda/11.7 | |||
}} | |||
{{Command | |||
|nvprof --cpu-profiling on ./cg.x | |||
|result= | |||
... | |||
<Program output > | |||
... | |||
======== CPU profiling result (bottom up): | |||
Time(%) Time Name | |||
83.54% 90.6757s matvec(matrix const &, vector const &, vector const &) | |||
83.54% 90.6757s {{!}} main | |||
7.94% 8.62146s waxpby(double, vector const &, double, vector const &, vector const &) | |||
7.94% 8.62146s {{!}} main | |||
5.86% 6.36584s dot(vector const &, vector const &) | |||
5.86% 6.36584s {{!}} main | |||
2.47% 2.67666s allocate_3d_poisson_matrix(matrix&, int) | |||
2.47% 2.67666s {{!}} main | |||
0.13% 140.35ms initialize_vector(vector&, double) | |||
0.13% 140.35ms {{!}} main | |||
... | |||
======== Data collected at 100Hz frequency | |||
}} | |||
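The bottom-up report above can also be summarized by a script, which helps when comparing several profiling runs. The following is a minimal sketch, not part of the tutorial's code: it assumes the report text has been saved or pasted as plain text (with the wiki's <code>{{!}}</code> rendered as <code>|</code>), and the names <code>parse_cpu_profile</code>, <code>ROW</code> and <code>SAMPLE</code> are hypothetical. It keeps only the top-level function rows, skips the indented caller rows, and converts all times to seconds.

```python
import re

# Sample copied from the nvprof CPU profiling report shown above.
SAMPLE = """\
======== CPU profiling result (bottom up):
Time(%) Time Name
83.54% 90.6757s matvec(matrix const &, vector const &, vector const &)
83.54% 90.6757s | main
7.94% 8.62146s waxpby(double, vector const &, double, vector const &, vector const &)
7.94% 8.62146s | main
5.86% 6.36584s dot(vector const &, vector const &)
5.86% 6.36584s | main
2.47% 2.67666s allocate_3d_poisson_matrix(matrix&, int)
2.47% 2.67666s | main
0.13% 140.35ms initialize_vector(vector&, double)
0.13% 140.35ms | main
======== Data collected at 100Hz frequency
"""

# Hypothetical row pattern: percentage, time value, time unit, then the name.
ROW = re.compile(r"^\s*([\d.]+)%\s+([\d.]+)(ms|us|ns|s)\s+(.*)$")
UNIT_TO_SECONDS = {"s": 1.0, "ms": 1e-3, "us": 1e-6, "ns": 1e-9}


def parse_cpu_profile(text):
    """Return (name, percent, seconds) tuples, largest share first."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # banner lines, column header, program output
        percent, value, unit, name = match.groups()
        if name.startswith("|"):
            continue  # caller row belonging to the function above it
        seconds = float(value) * UNIT_TO_SECONDS[unit]
        rows.append((name.strip(), float(percent), seconds))
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for name, percent, seconds in parse_cpu_profile(SAMPLE):
        print(f"{percent:6.2f}%  {seconds:10.3f}s  {name}")
```

With the report shown above, the first entry returned is <code>matvec</code>, which accounts for 83.54% of the sampled time and is therefore the first candidate for optimization.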
<translate> | <translate> | ||
=== NVIDIA Visual Profiler === <!--T:13--> | |||
=== NVIDIA Visual Profiler - (to be moved to another page) === <!--T:13--> | |||
[[File:Nvvp-pic0.png|thumbnail|300px|NVVP profiler|right]] | |||
[[File:Nvvp-pic1.png|thumbnail|300px|Browse for the executable you want to profile|right]] | |||
<!--T:14--> | <!--T:14--> | ||
Line 103: | Line 135: | ||
}} | }} | ||
<translate> | <translate> | ||
# After the NVVP startup window, you get prompted for a ''Workspace'' directory, which will be used for temporary files. Replace <code>home</code> with <code>scratch</code> in the suggested path. Then click ''OK''. | # After the NVVP startup window, you get prompted for a ''Workspace'' directory, which will be used for temporary files. Replace <code>home</code> with <code>scratch</code> in the suggested path. Then click ''OK''. | ||
Line 114: | Line 144: | ||
# Click ''Next >'' to review additional profiling options. | # Click ''Next >'' to review additional profiling options. | ||
# Click ''Finish'' to start profiling the executable. | # Click ''Finish'' to start profiling the executable. | ||
== Compiler Feedback == <!--T:16--> | == Compiler Feedback == <!--T:16--> |