OpenACC Tutorial - Adding directives: Difference between revisions

Jump to navigation Jump to search
No edit summary
Line 291: Line 291:
<translate>
<translate>
The results are correct. However, not only do we not get any speed up, but we rather get a slow down by a factor of almost 4! Let's profile the code again using NVidia's visual profiler (<tt>nvvp</tt>). This can be done with the following steps:  
The results are correct. However, not only do we not get any speed up, but we rather get a slow down by a factor of almost 4! Let's profile the code again using NVidia's visual profiler (<tt>nvvp</tt>). This can be done with the following steps:  
# Start <tt>nvvp</tt> with the command <tt>nvvp &</tt>  (the <tt>&</tt> sign is to start it in the background
# Start <tt>nvvp</tt> with the command <tt>nvvp &</tt>  (the <tt>&</tt> sign is to start it in the background)
# Go in File -> New Session
# Go in File -> New Session
# In the "File:" field, search for the executable (named <tt>challenge</tt> in our example).
# In the "File:" field, search for the executable (named <tt>challenge</tt> in our example).
Bureaucrats, cc_docs_admin, cc_staff, rsnt_translations
2,837

edits

Navigation menu