(Marked this version for translation) |
No edit summary |
||
Line 106: | Line 106: | ||
<!--T:19--> | <!--T:19--> | ||
Another way to monitor the usage of a running job is by [https://docs.alliancecan.ca/wiki/Running_jobs#Attaching_to_a_running_job attaching to the node] where the job is currently running and then by using <code>nvidia-smi</code> to read the GPU metrics in real time. | Another way to monitor the usage of a running job is by [https://docs.alliancecan.ca/wiki/Running_jobs#Attaching_to_a_running_job attaching to the node] where the job is currently running and then by using <code>nvidia-smi</code> to read the GPU metrics in real time. | ||
This will not provide maximum and average values for memory and power usage of the entire job, but it may be helpful to identify and troubleshoot | This will not provide maximum and average values for memory and power usage of the entire job, but it may be helpful to identify and troubleshoot underperforming jobs. | ||
</translate> | </translate> |
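Because a live <code>nvidia-smi</code> view does not report job-wide maximum or average values, one possible workaround is to log periodic samples and summarize them afterwards. The following is a minimal sketch, not part of the original page. It assumes the samples were collected on the node with <code>nvidia-smi --query-gpu=timestamp,utilization.gpu,memory.used,power.draw --format=csv,noheader,nounits -l 30</code> and redirected to a file. The function name <code>summarize_gpu_samples</code> and the choice of fields are hypothetical, and the result only covers the period that was actually sampled, not the entire job.

```python
import csv
import io
from statistics import mean


def summarize_gpu_samples(samples_csv: str) -> dict:
    """Summarize nvidia-smi CSV samples (hypothetical helper).

    Expects lines of the form
        timestamp, utilization.gpu, memory.used, power.draw
    as produced with --format=csv,noheader,nounits (an assumption).
    """
    util, mem, power = [], [], []
    reader = csv.reader(io.StringIO(samples_csv), skipinitialspace=True)
    for row in reader:
        if len(row) != 4:
            continue  # skip blank or malformed lines
        try:
            u, m, p = float(row[1]), float(row[2]), float(row[3])
        except ValueError:
            continue  # skip samples with non-numeric fields such as [N/A]
        util.append(u)
        mem.append(m)
        power.append(p)
    if not util:
        raise ValueError("no valid samples found")
    return {
        "samples": len(util),
        "util_avg_percent": mean(util),
        "util_max_percent": max(util),
        "mem_max_mib": max(mem),
        "power_avg_watts": mean(power),
        "power_max_watts": max(power),
    }


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    example = (
        "2024/01/01 12:00:00.000, 10, 1000, 100.0\n"
        "2024/01/01 12:00:30.000, 50, 2000, 150.0\n"
        "2024/01/01 12:01:00.000, 90, 3000, 200.0\n"
    )
    print(summarize_gpu_samples(example))
```

A consistently low average utilization over the sampled window can be one indication that the job is not making good use of the GPU.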