Nvprof: Difference between revisions

== Compile your code ==
To get useful information from Nvprof, you first need to compile your code with one of the CUDA compilers (<code>nvcc</code> for C).
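As a minimal sketch, the following program can be built with <code>nvcc</code>. The file name <code>hello.cu</code>, the kernel name <code>fill_index</code>, and the launch configuration are assumptions made for illustration; they do not come from any particular application.

```cuda
// hello.cu (hypothetical file name, used only for this example)
// Compile with: nvcc -o hello hello.cu
#include <cstdio>
#include <cuda_runtime.h>

// Trivial kernel: each thread writes its global index into the output array.
__global__ void fill_index(int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = i;
    }
}

int main()
{
    const int n = 256;
    int host[n];
    int *dev = nullptr;

    // Allocate device memory and stop early if no usable GPU is available.
    if (cudaMalloc(&dev, n * sizeof(int)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    // Launch enough 128-thread blocks to cover all n elements.
    fill_index<<<(n + 127) / 128, 128>>>(dev, n);

    // Copy the result back; this call also waits for the kernel to finish.
    cudaError_t err = cudaMemcpy(host, dev, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("last element: %d\n", host[n - 1]);
    return 0;
}
```

The resulting executable can then be passed to Nvprof like any other program.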
== Profiling modes ==
Nvprof operates in one of the modes listed below.
=== Summary mode ===
This is the default operating mode for Nvprof. It outputs a single result line for each kernel function and for each type of CUDA memory copy/set performed by the application. For each kernel or type of memory copy, Nvprof reports the total time of all instances, as well as the average, minimum, and maximum time.
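To illustrate what summary mode aggregates, here is a small sketch of a program that launches the same kernel several times and performs one memory copy in each direction. The file name <code>summary_demo.cu</code>, the kernel name <code>scale</code>, and the problem size are hypothetical. When run under Nvprof in its default mode, one would expect the five launches to be reported together on a single line for the kernel, with separate lines for the host-to-device and device-to-host copies.

```cuda
// summary_demo.cu (hypothetical file name, used only for this example)
// Compile with: nvcc -o summary_demo summary_demo.cu
// Profile with: nvprof ./summary_demo
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Multiply every element of the array by a constant factor.
__global__ void scale(float *data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= factor;
    }
}

int main()
{
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    std::vector<float> host(n, 1.0f);
    float *dev = nullptr;

    if (cudaMalloc(&dev, bytes) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    // One host-to-device copy.
    cudaMemcpy(dev, host.data(), bytes, cudaMemcpyHostToDevice);

    // Five instances of the same kernel; each launch doubles every element.
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    for (int rep = 0; rep < 5; ++rep) {
        scale<<<blocks, threads>>>(dev, 2.0f, n);
    }

    // One device-to-host copy; it also waits for the kernels to finish.
    cudaError_t err = cudaMemcpy(host.data(), dev, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("first element after scaling: %f\n", host[0]);
    return 0;
}
```

Launching the kernel in a loop is a deliberate choice here: it makes the difference between the total time and the average, minimum, and maximum times visible in the summary.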