Nvprof: Difference between revisions

587 bytes added, 1 year ago
Disabling DCGM
(add LD_LIBRARY_PATH command)
(Disabling DCGM)
 


= Quickstart guide =
On [[Béluga/en|Béluga]] and [[Narval/en|Narval]], the
[https://developer.nvidia.com/dcgm NVIDIA Data Center GPU Manager (DCGM)]
must be disabled before profiling, and this has to be requested when you submit your job:
[name@server ~]$ DISABLE_DCGM=1 salloc --gres=gpu:1 ...
Once your job starts, DCGM stops running within about a minute.
For convenience, the following loop waits until the monitoring service has stopped
(that is, until <code>grep</code> returns nothing):
[name@server ~]$ while [ ! -z "$(dcgmi -v | grep 'Hostengine build info:')" ]; do sleep 5; done
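If you would rather not wait indefinitely when DCGM fails to stop, the same check can be wrapped in a function with a timeout. This is only a sketch: the function name <code>wait_for_dcgm_stop</code> and the default values of 120 and 5 seconds are assumptions made for illustration and are not part of any cluster tooling.

```shell
# Hypothetical helper: wait until the DCGM host engine stops answering,
# or give up after a timeout.
# Usage: wait_for_dcgm_stop [timeout_seconds] [poll_interval_seconds]
wait_for_dcgm_stop() {
    timeout="${1:-120}"   # assumed upper bound, not a documented value
    interval="${2:-5}"    # same polling period as the loop above
    elapsed=0
    # grep -q succeeds for as long as dcgmi still reports a host engine.
    while dcgmi -v 2>/dev/null | grep -q 'Hostengine build info:'; do
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "DCGM still running after ${elapsed}s" >&2
            return 1
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 0
}
```

Calling <code>wait_for_dcgm_stop || exit 1</code> at the start of a job would then abort instead of hanging if the service is still running after the timeout.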
== Environment modules == <!--T:3-->
Before you start profiling with NVPROF, the appropriate [[Utiliser des modules/en|module]] needs to be loaded.   