Multi-Instance GPU

<languages />
<translate>  
<!--T:1-->
Many programs are unable to fully use modern GPUs such as the NVIDIA [https://www.nvidia.com/en-us/data-center/a100/ A100] and [https://www.nvidia.com/en-us/data-center/h100/ H100].
[https://www.nvidia.com/en-us/technologies/multi-instance-gpu/ Multi-Instance GPU (MIG)] is a technology that allows partitioning a single GPU into multiple [https://docs.nvidia.com/datacenter/tesla/mig-user-guide/index.html#terminology instances], making each one a completely independent GPU.
Each instance gets a dedicated slice of the GPU's computational resources and memory, isolated from the other instances by on-chip protections.
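As an illustration, the MIG profiles on an A100-40GB divide the GPU into up to seven compute slices; the following sketch lists the profiles described in NVIDIA's MIG user guide. Treat it as illustrative only, since the available profiles depend on the GPU model and the cluster's configuration.

```python
# Illustrative MIG profiles for an NVIDIA A100-40GB (per NVIDIA's MIG
# user guide; the exact set depends on GPU model and configuration).
# A profile name "<n>g.<m>gb" means n of the 7 compute slices and m GB
# of the GPU's memory.
A100_40GB_PROFILES = {
    "1g.5gb":  {"compute_slices": 1, "memory_gb": 5},
    "2g.10gb": {"compute_slices": 2, "memory_gb": 10},
    "3g.20gb": {"compute_slices": 3, "memory_gb": 20},
    "4g.20gb": {"compute_slices": 4, "memory_gb": 20},
    "7g.40gb": {"compute_slices": 7, "memory_gb": 40},
}

def compute_fraction(profile: str) -> float:
    """Fraction of the full GPU's compute slices that a profile provides."""
    return A100_40GB_PROFILES[profile]["compute_slices"] / 7
```

For example, a <code>3g.20gb</code> instance provides 3/7 of the compute slices and half of the memory, which is why jobs using less than half a GPU are good candidates.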


<!--T:2-->
<!--T:2-->
Using GPU instances is less wasteful, and usage is billed accordingly. Jobs submitted on such instances use less of your allocated priority than jobs on a full GPU; you will then be able to execute more jobs and have shorter wait times.


= Which jobs should use GPU instances instead of full GPUs? = <!--T:3-->
Jobs that use less than half of the computing power of a GPU and less than half of the available GPU memory should be evaluated and tested on a GPU instance. In most cases, these jobs will run just as fast on a GPU instance and consume less than half of the computing resources.
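On a Slurm cluster, such a job could request a GPU instance instead of a full GPU through a <code>--gres</code> specification. The sketch below is hypothetical: the instance profile name (<code>3g.20gb</code>) and the exact <code>--gres</code> syntax vary between clusters, so check your cluster's documentation for the names actually offered.

```shell
#!/bin/bash
# Hypothetical sketch: request one MIG instance instead of a full GPU.
# The gres name "gpu:3g.20gb" is illustrative only; the profiles and
# naming scheme depend on how your cluster exposes MIG instances.
#SBATCH --gres=gpu:3g.20gb:1
#SBATCH --mem=20G
#SBATCH --time=1:00:00

python train.py
```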


See the section [[#Finding_which_of_your_jobs_should_use_a_GPU_instance|Finding which of your jobs should use a GPU instance]] for more details.


= Limitations = <!--T:4-->
[https://docs.nvidia.com/datacenter/tesla/mig-user-guide/index.html#app-considerations GPU instances do not support] [https://developer.nvidia.com/docs/drive/drive-os/6.0.8.1/public/drive-os-linux-sdk/common/topics/nvsci_nvsciipc/Inter-ProcessCommunication1.html CUDA Inter-Process Communication (IPC)], which optimises data transfers between GPUs over NVLink and NVSwitch.
This limitation also affects communications between GPU instances in a single GPU.
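A practical consequence is that multi-process code relying on CUDA IPC should first check whether it is running on a MIG instance. A minimal sketch of such a check, assuming the usual <code>nvidia-smi -L</code> output format (each MIG instance appears on its own line starting with "MIG" under its parent GPU), might look like:

```python
import subprocess

def lists_mig_devices(listing: str) -> bool:
    """Return True if an `nvidia-smi -L` listing contains MIG instances.

    Assumes the typical output format, where each MIG instance is printed
    on an indented line beginning with "MIG" after the parent GPU line.
    """
    return any(line.strip().startswith("MIG") for line in listing.splitlines())

def running_on_mig() -> bool:
    """Sketch: query nvidia-smi (assumed to be on PATH) and inspect the list."""
    out = subprocess.run(["nvidia-smi", "-L"], capture_output=True, text=True)
    return lists_mig_devices(out.stdout)
```

If the check returns True, code paths that use CUDA IPC for inter-GPU transfers should fall back to ordinary host-mediated copies.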