}}
[[File:Openacc profiling1.png|thumbnail|Click to enlarge.]]
The results are correct; however, far from gaining speed, the operation took nearly four times longer! Let's use the NVIDIA Visual Profiler (<tt>nvvp</tt>) to see what is happening.
# Start <tt>nvvp</tt> with the command <tt>nvvp &</tt>, where the <tt>&</tt> symbol launches it in the background.
# Select ''File -> New Session''.
# In the "File:" field, browse to the executable; in our example, we use <tt>challenge</tt>.
# Click "Next" until you can click "Finish".
This runs the program and generates a timeline of the execution, shown in the image on the right. As we can see, almost all of the run time is spent transferring data between the host and the device. This is very often the case when porting code from CPU to GPU. We will look at how to optimize this in the [[OpenACC Tutorial - Data movement|next part of the tutorial]].
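
To make the transfer-dominated timeline more concrete, here is a minimal sketch of the kind of pattern that tends to produce it. It is not the tutorial's <tt>challenge</tt> code: the functions <tt>axpy</tt> and <tt>repeated_axpy</tt> and the explicit data clauses are assumptions chosen for illustration. Because the compute region sits inside a function that is called repeatedly, and no enclosing data region keeps the arrays on the device, the arrays are copied on every call.

```c
#include <stddef.h>

/* Hypothetical kernel, not taken from the tutorial's challenge code.
 * With an OpenACC compiler and no enclosing data region, every call
 * copies x and y to the device, runs the loop there, and copies y back.
 * Without OpenACC support the pragma is ignored and the loop simply
 * runs on the CPU. */
void axpy(size_t n, double a, const double *restrict x, double *restrict y)
{
    #pragma acc kernels copyin(x[0:n]) copy(y[0:n])
    for (size_t i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}

/* Calling the kernel in a loop repeats the host/device copies on every
 * iteration: 'steps' round trips, each for very little arithmetic. */
void repeated_axpy(size_t n, int steps, double a, const double *x, double *y)
{
    for (int s = 0; s < steps; s++) {
        axpy(n, a, x, y);
    }
}
```

Each call performs only one multiply-add per element, so the time spent moving <tt>x</tt> and <tt>y</tt> across the bus can easily exceed the time spent computing. How large the imbalance is depends on the hardware, the array size, and the compiler, so treat this as an illustration of the cause rather than a measurement.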