Nvprof: Difference between revisions

This is the default operating mode for Nvprof. It outputs a single result line for each kernel function and each type of CUDA memory copy/set performed by the application. For each kernel function or type of memory copy, Nvprof reports the total time across all instances, as well as the average, minimum, and maximum time.
In this example, the application is <code>a.out</code>, and we run Nvprof to profile it:
{{Command|nvprof ./a.out|result =
[Matrix Multiply Using CUDA] - Starting...
==27694== NVPROF is profiling process 27694, command: matrixMul