<languages />
<translate>
<!--T:1-->
Many programs are unable to fully use modern GPUs such as NVidia [https://www.nvidia.com/en-us/data-center/a100/ A100s] and [https://www.nvidia.com/en-us/data-center/h100/ H100s].
[https://www.nvidia.com/en-us/technologies/multi-instance-gpu/ Multi-Instance GPU (MIG)] is a technology that allows partitioning a single GPU into multiple [https://docs.nvidia.com/datacenter/tesla/mig-user-guide/index.html#terminology instances], making each one a completely independent GPU.
Each of the GPU's instances gets a slice of the GPU's computational resources and memory, all detached from the other instances by on-chip protections.
<!--T:2-->
Using GPU instances is less wasteful, and usage is billed accordingly. Jobs submitted on such instances use less of your allocated priority compared to a full GPU; you will then be able to execute more jobs and have shorter wait times.
= Which jobs should use GPU instances instead of full GPUs? = <!--T:3-->
Jobs that use less than half of the computing power of a GPU and less than half of the available GPU memory should be evaluated and tested on a GPU instance. In most cases, these jobs will run just as fast on a GPU instance and consume less than half of the computing resource.
See the section [[#Finding_which_of_your_jobs_should_use_a_GPU_instance|Finding which of your jobs should use a GPU instance]] for more details.
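As a rough illustration (not an official tool), the rule of thumb above can be expressed as a small check, assuming you have already measured your job's peak GPU utilization and memory use with a monitoring tool such as <code>nvidia-smi</code>; the function name and the default memory size (a 40 GB A100) are illustrative.

```python
# Hypothetical helper illustrating the rule of thumb above: a job is a
# candidate for a GPU instance when it uses less than half of the GPU's
# compute AND less than half of its memory. Thresholds and the default
# 40 GiB (A100) memory size are assumptions for this sketch.

def fits_gpu_instance(peak_util_pct, peak_mem_gib, full_gpu_mem_gib=40.0):
    """Return True if measured peak usage suggests a GPU instance suffices."""
    return peak_util_pct < 50.0 and peak_mem_gib < full_gpu_mem_gib / 2.0

# A job peaking at 30% utilization and 8 GiB on a 40 GiB A100:
print(fits_gpu_instance(30.0, 8.0))   # candidate for a GPU instance
# A job peaking at 85% utilization and 30 GiB:
print(fits_gpu_instance(85.0, 30.0))  # keep a full GPU
```

Jobs that pass such a check should still be tested on an actual instance, since utilization patterns vary over a job's lifetime.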
= Limitations = <!--T:4-->
[https://docs.nvidia.com/datacenter/tesla/mig-user-guide/index.html#app-considerations GPU instances do not support] the [https://developer.nvidia.com/docs/drive/drive-os/6.0.8.1/public/drive-os-linux-sdk/common/topics/nvsci_nvsciipc/Inter-ProcessCommunication1.html CUDA Inter-Process Communication (IPC)], which optimises data transfers between GPUs over NVLink and NVSwitch.
This limitation also affects communications between GPU instances in a single GPU.