Using cloud vGPUs
Revision as of 16:27, 1 June 2020
This is not a complete article: This is a draft, a work in progress that is intended to be published into an article, which may or may not be ready for inclusion in the main wiki. It should not necessarily be considered factual or authoritative.
This guide describes how to allocate vGPU resources to a virtual machine (VM), install the necessary drivers, and check whether the vGPU can be used.
Supported flavors
To use a GPU within a VM, the instance needs to be deployed on one of the flavors listed below. The GPU will be available to the operating system via the PCI bus.
- vgpu1-c18-56gb
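Because the GPU is exposed to the guest over the PCI bus, listing the PCI devices is a quick way to confirm that the flavor actually attached one. The snippet below is only a sketch: the helper name `find_nvidia_devices` is hypothetical, and it assumes `lspci` (from the pciutils package) is present in the image.

```shell
#!/bin/sh
# Hypothetical helper: filter `lspci` output (read from stdin) and keep only
# the lines that mention NVIDIA. Assumes lspci is available; on a minimal
# image it may need to be installed first (yum -y install pciutils).
find_nvidia_devices() {
    grep -i 'nvidia'
}

# Typical use inside the VM:
#   lspci | find_nvidia_devices
```

If nothing is printed, the instance was most likely not deployed on a vGPU flavor.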
Preparation of a VM running CentOS7
Once the VM is available, update the OS to the latest available software, including the kernel, and reboot the VM so that it runs the latest kernel.
[root@centos7]# yum -y update && reboot
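After the reboot it can be useful to confirm that the VM really is running the newest installed kernel, since the driver modules are built against the running kernel. The following is a minimal sketch; the function names are hypothetical, and it assumes that the first entry of `rpm -q --last kernel` (the most recently installed kernel package) is the one that should be running.

```shell
#!/bin/sh
# Hypothetical helper: read `rpm -q --last kernel` output on stdin and print
# the version-release.arch of the first (most recently installed) entry,
# with the leading "kernel-" removed so it can be compared with `uname -r`.
newest_installed_kernel() {
    awk 'NR == 1 { sub(/^kernel-/, "", $1); print $1 }'
}

# Hypothetical helper: succeed if the running kernel ($1) equals the newest
# installed kernel ($2).
running_latest_kernel() {
    [ "$1" = "$2" ]
}

# Typical use after the reboot:
#   if running_latest_kernel "$(uname -r)" "$(rpm -q --last kernel | newest_installed_kernel)"; then
#       echo "running the newest installed kernel"
#   else
#       echo "another reboot is needed"
#   fi
```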
Since the proprietary NVIDIA drivers need to be compiled against the running kernel, the package dkms is required. It is available from the EPEL repository, which is enabled with the following command.
[root@centos7]# yum -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
Install the Arbutus Cloud repository definition. This also installs the public key that the packages are signed with, which ensures their authenticity. The drivers and userspace tools in this repository are carefully tested against the infrastructure before they are made available.
[root@centos7]# yum -y install http://repo.arbutus.cloud.computecanada.ca/pulp/repos/centos/7/x86_64/Packages/a/arbutus-cloud-vgpu-repo-1.0-1.el7.noarch.rpm
The last step is to install the NVIDIA vGPU packages. Installing the kernel module package 'nvidia-vgpu-kmod' will take a few minutes, as it compiles the required kernel modules in the background.
[root@centos7]# yum -y install nvidia-vgpu-kmod nvidia-vgpu-gridd nvidia-vgpu-tools
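Since the module build runs in the background, it can be worth checking that the kernel module is loaded before relying on the GPU. The sketch below assumes the module is named `nvidia`, as with the standard NVIDIA driver; the helper name is hypothetical. Running `dkms status` by hand should also show the state of the build, but its exact output for these packages is not covered here.

```shell
#!/bin/sh
# Hypothetical helper: read `lsmod` output on stdin and succeed only if a
# module named exactly "nvidia" is listed (assumption: the vGPU guest driver
# uses the same module name as the standard NVIDIA driver).
nvidia_module_loaded() {
    awk '$1 == "nvidia" { found = 1 } END { exit !found }'
}

# Typical use once the installation has finished:
#   if lsmod | nvidia_module_loaded; then
#       echo "nvidia module is loaded"
#   else
#       echo "nvidia module is not loaded yet"
#   fi
```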
After the successful installation, the vGPU is now accessible and licensed.
[root@centos7]# nvidia-smi
Mon Jun  1 16:03:27 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.56       Driver Version: 440.56       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GRID V100D-8C       On   | 00000000:00:05.0 Off |                    0 |
| N/A   N/A    P0    N/A /  N/A |    560MiB /  8192MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
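For use in scripts, the same details can be requested in CSV form through the `--query-gpu` and `--format` options of nvidia-smi. The helper below is a hypothetical sketch that picks one field out of such a line; it assumes the fields are separated by a comma and a space, as nvidia-smi prints them.

```shell
#!/bin/sh
# Hypothetical helper: print field number $1 of a comma-separated line read
# from stdin, e.g. the output of
#   nvidia-smi --query-gpu=name,memory.total,driver_version --format=csv,noheader
csv_field() {
    awk -F', ' -v n="$1" 'NR == 1 { print $n }'
}

# Typical use inside the VM:
#   nvidia-smi --query-gpu=name,memory.total,driver_version --format=csv,noheader | csv_field 1
```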
To check the license status as well as other information about the vGPU, run:
[root@centos7]# nvidia-smi -q | less

==============NVSMI LOG==============

Timestamp                           : Mon Jun  1 16:06:59 2020
Driver Version                      : 440.56
CUDA Version                        : 10.2

Attached GPUs                       : 1
GPU 00000000:00:05.0
    Product Name                    : GRID V100D-8C
    Product Brand                   : Grid
    Display Mode                    : Enabled
    Display Active                  : Disabled
    Persistence Mode                : Enabled
    Accounting Mode                 : Disabled
    Accounting Mode Buffer Size     : 4000
    Driver Model
        Current                     : N/A
        Pending                     : N/A
    Serial Number                   : N/A
    GPU UUID                        : GPU-315b585a-a41e-11ea-a63b-4ed0221b4f99
    Minor Number                    : 0
    VBIOS Version                   : 00.00.00.00.00
    MultiGPU Board                  : No
    Board ID                        : 0x5
    GPU Part Number                 : N/A
    Inforom Version
        Image Version               : N/A
        OEM Object                  : N/A
        ECC Object                  : N/A
        Power Management Object     : N/A
    GPU Operation Mode
        Current                     : N/A
        Pending                     : N/A
    GPU Virtualization Mode
        Virtualization Mode         : VGPU
        Host VGPU Mode              : N/A
    GRID Licensed Product
        Product Name                : NVIDIA vComputeServer
        License Status              : Licensed
    IBMNPU
        Relaxed Ordering Mode       : N/A
    PCI
        Bus                         : 0x00
        Device                      : 0x05
        Domain                      : 0x0000
        Device Id                   : 0x1DB610DE
        Bus Id                      : 00000000:00:05.0
        Sub System Id               : 0x139610DE
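To check only the license state, for example from a provisioning script, the relevant line can be extracted from this output. This is a minimal sketch with a hypothetical helper name; it assumes the "License Status" line has the layout shown above for driver 440.56, which may differ in other driver versions.

```shell
#!/bin/sh
# Hypothetical helper: read `nvidia-smi -q` output on stdin and print the
# value of the first "License Status" line (e.g. "Licensed").
license_status() {
    awk -F': *' '/License Status/ { print $2; exit }'
}

# Typical use inside the VM:
#   if [ "$(nvidia-smi -q | license_status)" = "Licensed" ]; then
#       echo "vGPU is licensed"
#   else
#       echo "vGPU is not licensed"
#   fi
```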