Using cloud vGPUs
Revision as of 21:25, 24 February 2023
This guide describes how to
- allocate virtual GPU (vGPU) resources to a virtual machine (VM),
- install the necessary drivers and
- check whether the vGPU can be used.
Access to repositories as well as to the vGPUs is currently only available within Arbutus Cloud. Please note that the documentation below only covers the vGPU driver installation; the CUDA toolkit is not pre-installed. The CUDA toolkit can be installed directly from Nvidia or used from the CVMFS software stack. If you choose to install the toolkit directly from Nvidia, please ensure that the vGPU driver is not overwritten with the one from the CUDA package.
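One way to avoid overwriting the vGPU driver is to install a toolkit-only package. The sketch below derives such a package name for a given CUDA version. It assumes that Nvidia's own package repository has been configured on the VM and that it follows Nvidia's `cuda-toolkit-X-Y` naming scheme, in which the plain `cuda` meta-package would also pull in a driver. The function name `cuda_toolkit_package` is hypothetical; check the package name against the repository before installing.

```shell
# Print the toolkit-only package name for a CUDA version such as "11.2".
# Assumption: Nvidia's repository uses the cuda-toolkit-X-Y naming scheme.
cuda_toolkit_package() {
    printf 'cuda-toolkit-%s\n' "$(printf '%s' "$1" | tr '.' '-')"
}

# Possible usage on CentOS7 (not executed here):
#   yum -y install "$(cuda_toolkit_package 11.2)"
```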
Supported flavors
To use a vGPU within a VM, the instance needs to be deployed on one of the flavors listed below. The vGPU will be available to the operating system via the PCI bus.
- g1-8gb-c4-22gb
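Before installing any drivers, it can be useful to confirm that the vGPU is visible on the PCI bus. The following is a minimal sketch, assuming the `lspci` tool (from the pciutils package) is installed in the VM; the helper name `has_nvidia_pci` is hypothetical.

```shell
# Succeed if the `lspci` output read from stdin lists an NVIDIA device.
has_nvidia_pci() {
    grep -qi 'nvidia'
}

# Possible usage inside the VM (not executed here):
#   lspci | has_nvidia_pci && echo "vGPU is visible on the PCI bus"
```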
Preparation of a VM running CentOS7
Once the VM is available, make sure to update the OS to the latest available software, including the kernel. Then reboot the VM to have the latest kernel running.
[root@centos7]# yum -y update && reboot
Since the proprietary Nvidia drivers need to be compiled against the running kernel, the dkms package is required, which is available from the EPEL repository.
[root@centos7]# yum -y install epel-release
Install the Arbutus Cloud repository. This also installs the public key the packages are signed with to ensure their authenticity. These drivers and user-space tools are carefully tested against the infrastructure before they are made available.
[root@centos7]# yum -y install http://repo.arbutus.cloud.computecanada.ca/pulp/repos/centos/arbutus-cloud-vgpu-repo.el7.noarch.rpm
The last step is to install the Nvidia vGPU packages. Installing the kernel module package nvidia-vgpu-kmod will take a few minutes as it compiles the required kernel modules in the background.
[root@centos7]# yum -y install nvidia-vgpu-kmod nvidia-vgpu-gridd nvidia-vgpu-tools
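Once the installation has finished, one way to confirm that the compiled kernel module is actually loaded is to look for it in the `lsmod` output. This is a sketch only; it assumes the module is named `nvidia`, and the helper name `nvidia_module_loaded` is hypothetical.

```shell
# Succeed if the `lsmod` output on stdin lists a module named exactly "nvidia".
nvidia_module_loaded() {
    awk '$1 == "nvidia" { found = 1 } END { exit !found }'
}

# Possible usage inside the VM (not executed here):
#   lsmod | nvidia_module_loaded && echo "nvidia kernel module is loaded"
```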
If your installation was successful, the vGPU will be accessible and licensed.
Test by running nvidia-smi:
[root@centos7]# nvidia-smi
Tue Sep 21 17:40:33 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.91.03    Driver Version: 460.91.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GRID V100D-8C       On   | 00000000:00:05.0 Off |                  N/A |
| N/A   N/A    P0    N/A /  N/A |    560MiB /  8192MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
To check for the license status as well as other information about the vGPU:
[root@centos7]# nvidia-smi -q | less

==============NVSMI LOG==============

Timestamp                                 : Tue Sep 21 17:41:48 2021
Driver Version                            : 460.91.03
CUDA Version                              : 11.2

Attached GPUs                             : 1
GPU 00000000:00:05.0
    Product Name                          : GRID V100D-8C
    Product Brand                         : NVIDIA Virtual Compute Server
    Display Mode                          : Enabled
    Display Active                        : Disabled
    Persistence Mode                      : Enabled
    MIG Mode
        Current                           : N/A
        Pending                           : N/A
    Accounting Mode                       : Disabled
    Accounting Mode Buffer Size           : 4000
    Driver Model
        Current                           : N/A
        Pending                           : N/A
    Serial Number                         : N/A
    GPU UUID                              : GPU-c6d5d6c1-1b00-11ec-b031-a89a79e5169c
    Minor Number                          : 0
    VBIOS Version                         : 00.00.00.00.00
    MultiGPU Board                        : No
    Board ID                              : 0x5
    GPU Part Number                       : N/A
    Inforom Version
        Image Version                     : N/A
        OEM Object                        : N/A
        ECC Object                        : N/A
        Power Management Object           : N/A
    GPU Operation Mode
        Current                           : N/A
        Pending                           : N/A
    GPU Virtualization Mode
        Virtualization Mode               : VGPU
        Host VGPU Mode                    : N/A
    vGPU Software Licensed Product
        Product Name                      : NVIDIA Virtual Compute Server
        License Status                    : Licensed
    IBMNPU
        Relaxed Ordering Mode             : N/A
    PCI
        Bus                               : 0x00
        Device                            : 0x05
        Domain                            : 0x0000
        Device Id                         : 0x1DB610DE
        Bus Id                            : 00000000:00:05.0
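For scripted checks, the license status can be extracted from the `nvidia-smi -q` output rather than read by eye. The sketch below assumes the "License Status" line has the format shown in the log above (other driver versions may print it differently); the helper name `license_status` is hypothetical.

```shell
# Print the value of the first "License Status" line found on stdin.
license_status() {
    awk -F': *' '/License Status/ { print $2; exit }'
}

# Possible usage inside the VM (not executed here):
#   [ "$(nvidia-smi -q | license_status)" = "Licensed" ] && echo "vGPU is licensed"
```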
Preparation of a VM running Debian10
Ensure that the latest packages are installed and the system has been booted with the latest stable kernel, as dkms will request the latest one available from the Debian repositories.
root@debian10:~# apt-get update && apt-get -y dist-upgrade && reboot
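After the reboot, it may be worth confirming that the VM really booted into the newest installed kernel before dkms builds the modules. The sketch below compares the running kernel with the highest version that has a directory under /lib/modules. It assumes GNU `sort -V` is available and that its version ordering matches the kernel package ordering, which holds for typical Debian release strings but is not guaranteed in general. The helper names are hypothetical.

```shell
# Print the highest version from a list of kernel release strings on stdin.
newest_kernel() {
    sort -V | tail -n 1
}

# Succeed if the kernel release in $1 is the newest one that has a module
# directory under $2 (default: /lib/modules).
kernel_is_current() {
    running=$1
    modules_dir=${2:-/lib/modules}
    newest=$(ls "$modules_dir" | newest_kernel)
    [ "$running" = "$newest" ]
}

# Possible usage inside the VM (not executed here):
#   kernel_is_current "$(uname -r)" || echo "reboot into the latest kernel first"
```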
After a successful reboot, the system should be running the latest available kernel. The repository can now be installed by installing the arbutus-cloud-repo package. This package also contains the GPG key that all packages are signed with.
root@debian10:~# apt-get -y install gnupg
root@debian10:~# wget http://repo.arbutus.cloud.computecanada.ca/pulp/deb/debian/pool/main/arbutus-cloud-repo_0.1_all.deb
root@debian10:~# dpkg -i arbutus-cloud-repo_0.1_all.deb
The installation of the package will display a warning, since the key is directly imported (for convenience) via the package's post-installation procedure.
Setting up arbutus-cloud-repo (0.1) ...
Warning: apt-key should not be used in scripts (called from postinst maintainerscript of the package arbutus-cloud-repo)
OK
Update the local apt cache and install the vGPU packages:
root@debian10:~# apt-get update && apt-get -y install nvidia-vgpu-kmod nvidia-vgpu-tools nvidia-vgpu-gridd
If your installation was successful, the vGPU will be accessible and licensed.
Test by running nvidia-smi as shown above for CentOS7.
Preparation of a VM running Debian11
Ensure that the latest packages are installed and the system has been booted with the latest stable kernel, as dkms will request the latest one available from the Debian repositories.
root@debian11:~# apt-get update && apt-get -y dist-upgrade && reboot
After a successful reboot, the system should be running the latest available kernel. The repository can now be installed by installing the arbutus-cloud-repo package. This package also contains the GPG key that all packages are signed with.
root@debian11:~# wget http://repo.arbutus.cloud.computecanada.ca/pulp/deb/debian11/pool/main/arbutus-cloud-repo_0.1_all.deb
root@debian11:~# apt-get install -y ./arbutus-cloud-repo_0.1_all.deb
Update the local apt cache and install the vGPU packages:
root@debian11:~# apt-get update && apt-get -y install nvidia-vgpu-kmod nvidia-vgpu-tools nvidia-vgpu-gridd
Preparation of a VM running Ubuntu22
Ensure that the OS is up to date, that all the latest patches are installed, and that the latest stable kernel is running.
root@ubuntu22:~# apt-get update && apt-get -y dist-upgrade && reboot
After a successful reboot, the system should have the latest available kernel running.
Now the repository can be installed by installing the arbutus-cloud-repo package. This package also contains the GPG key that all packages are signed with.
root@ubuntu22:~# wget http://repo.arbutus.cloud.computecanada.ca/pulp/deb/ubuntu22/pool/main/arbutus-cloud-repo_0.1_all.deb
root@ubuntu22:~# dpkg -i arbutus-cloud-repo_0.1_all.deb
A warning will be displayed since the signature key is added in the post-install stage. The warning can be ignored. Update the local apt cache and install the vGPU packages:
root@ubuntu22:~# apt-get update && apt-get -y install nvidia-vgpu-kmod nvidia-vgpu-tools nvidia-vgpu-gridd
If your installation was successful, the vGPU will be accessible and licensed.
Test by running nvidia-smi as shown above for CentOS7.
Preparation of a VM running Ubuntu20
Ensure that the OS is up to date, that all the latest patches are installed, and that the latest stable kernel is running.
root@ubuntu20:~# apt-get update && apt-get -y dist-upgrade && reboot
After a successful reboot, the system should have the latest available kernel running.
Now the repository can be installed by installing the arbutus-cloud-repo package. This package also contains the GPG key that all packages are signed with.
root@ubuntu20:~# wget http://repo.arbutus.cloud.computecanada.ca/pulp/deb/ubuntu/pool/main/arbutus-cloud-repo_0.1ubuntu20_all.deb
root@ubuntu20:~# dpkg -i arbutus-cloud-repo_0.1ubuntu20_all.deb
A warning will be displayed since the signature key is added in the post-install stage. The warning can be ignored. Update the local apt cache and install the vGPU packages:
root@ubuntu20:~# apt-get update && apt-get -y install nvidia-vgpu-kmod nvidia-vgpu-tools nvidia-vgpu-gridd
If your installation was successful, the vGPU will be accessible and licensed.
Test by running nvidia-smi as shown above for CentOS7.
Preparation of a VM running Ubuntu18
Ensure that the OS is up to date, that all the latest patches are installed, and that the latest stable kernel is running.
root@ubuntu18:~# apt-get update && apt-get -y dist-upgrade && reboot
After a successful reboot, the system should have the latest available kernel running.
Now the repository can be installed by installing the arbutus-cloud-repo package. This package also contains the GPG key that all packages are signed with.
root@ubuntu18:~# wget http://repo.arbutus.cloud.computecanada.ca/pulp/deb/ubuntu18/pool/main/arbutus-cloud-repo_0.1ubuntu18_all.deb
root@ubuntu18:~# dpkg -i arbutus-cloud-repo_0.1ubuntu18_all.deb
A warning will be displayed since the signature key is added in the post-install stage. The warning can be ignored. Update the local apt cache and install the vGPU packages:
root@ubuntu18:~# apt-get update && apt-get -y install nvidia-vgpu-kmod nvidia-vgpu-tools nvidia-vgpu-gridd
If your installation was successful, the vGPU will be accessible and licensed.
Test by running nvidia-smi as shown above for CentOS7.
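The Debian and Ubuntu procedures above differ mainly in the URL of the repository package. For automated deployments, the mapping could be collected in one place, as in the sketch below. The URLs are copied from the download commands above; the function name `repo_package_url` and the idea of keying on the `ID` and `VERSION_ID` fields of /etc/os-release are assumptions made for illustration.

```shell
# Base URL taken from the download commands above.
ARBUTUS_DEB_BASE="http://repo.arbutus.cloud.computecanada.ca/pulp/deb"

# Print the repository package URL for a distribution ID and version,
# e.g. "debian 10" or "ubuntu 22.04" (ID and VERSION_ID in /etc/os-release).
repo_package_url() {
    major=${2%%.*}   # keep the major version only: "22.04" -> "22"
    case "$1$major" in
        debian10) echo "$ARBUTUS_DEB_BASE/debian/pool/main/arbutus-cloud-repo_0.1_all.deb" ;;
        debian11) echo "$ARBUTUS_DEB_BASE/debian11/pool/main/arbutus-cloud-repo_0.1_all.deb" ;;
        ubuntu22) echo "$ARBUTUS_DEB_BASE/ubuntu22/pool/main/arbutus-cloud-repo_0.1_all.deb" ;;
        ubuntu20) echo "$ARBUTUS_DEB_BASE/ubuntu/pool/main/arbutus-cloud-repo_0.1ubuntu20_all.deb" ;;
        ubuntu18) echo "$ARBUTUS_DEB_BASE/ubuntu18/pool/main/arbutus-cloud-repo_0.1ubuntu18_all.deb" ;;
        *) echo "unsupported distribution: $1 $2" >&2; return 1 ;;
    esac
}

# Possible usage inside the VM (not executed here):
#   . /etc/os-release && wget "$(repo_package_url "$ID" "$VERSION_ID")"
```

Returning a non-zero status for unknown distributions lets a calling script stop before attempting a download from a URL that does not exist.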