Multi-Instance GPU: Difference between revisions

no edit summary
(Marked this version for translation)
No edit summary
Line 106: Line 106:
<!--T:19-->
<!--T:19-->
Another way to monitor the usage of a running job is by [https://docs.alliancecan.ca/wiki/Running_jobs#Attaching_to_a_running_job attaching to the node] where the job is currently running and then by using <code>nvidia-smi</code> to read the GPU metrics in real time.
Another way to monitor the usage of a running job is by [https://docs.alliancecan.ca/wiki/Running_jobs#Attaching_to_a_running_job attaching to the node] where the job is currently running and then by using <code>nvidia-smi</code> to read the GPU metrics in real time.
This will not provide maximum and average values for memory and power usage of the entire job, but it may be helpful to identify and troubleshoot under-performing jobs.
This will not provide maximum and average values for memory and power usage of the entire job, but it may be helpful to identify and troubleshoot underperforming jobs.
</translate>
</translate>