Using cloud vGPUs
Revision as of 16:44, 21 September 2021
This guide describes how to allocate vGPU resources to a virtual machine (VM), install the necessary drivers, and check whether the vGPU can be used. Access to the repository and to the vGPUs is currently only available within Arbutus Cloud. Please note that the documentation below only covers the vGPU driver installation; the CUDA toolkit is not pre-installed. The CUDA toolkit can be installed directly from NVIDIA or used from the CVMFS software stack.
Supported flavors
To use a vGPU within a VM, the instance needs to be deployed on one of the flavors listed below. The vGPU will be available to the operating system via the PCI bus. While the setup for more vGPU profiles is being finalized, the only flavor currently accessible is:
- g1-8gb-c4-22gb
Preparation of a VM running CentOS7
Once the VM is available, update the OS to the latest available software, including the kernel, and reboot the VM so that it runs the latest kernel.
[root@centos7]# yum -y update && reboot
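After the reboot, it can be worth confirming that the VM really is running the newest installed kernel, since the driver modules are built against the running kernel. The following is a minimal sketch, not part of the original procedure: the function names are hypothetical, and it assumes that each installed kernel has a subdirectory under /lib/modules and that GNU `sort -V` is available.

```shell
#!/usr/bin/env bash
# Sketch (assumption: every installed kernel has a directory in /lib/modules).

# Print the highest kernel version found under a modules directory.
# The directory can be overridden, which makes the function easy to test.
newest_installed_kernel() {
    local modules_dir="${1:-/lib/modules}"
    ls -1 "$modules_dir" | sort -V | tail -n 1
}

# Succeed only if the given running kernel is the newest installed one.
running_latest_kernel() {
    local running="$1"
    local modules_dir="${2:-/lib/modules}"
    [ "$running" = "$(newest_installed_kernel "$modules_dir")" ]
}

# Example use on the VM:
#   running_latest_kernel "$(uname -r)" || echo "newer kernel installed, reboot first"
```

Leftover module directories from removed kernels would confuse this check, so treat a mismatch as a hint to investigate rather than as a definite error.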
Since the proprietary NVIDIA drivers need to be compiled against the running kernel, the dkms package from the EPEL repository is required.
[root@centos7]# yum -y install epel-release
Install the Arbutus Cloud repository package. It also installs the public key that the packages are signed with, which ensures their authenticity. These drivers and userspace tools are carefully tested against the infrastructure before they are made available.
[root@centos7]# yum -y install http://repo.arbutus.cloud.computecanada.ca/pulp/repos/centos/arbutus-cloud-vgpu-repo.el7.noarch.rpm
The last step is to install the NVIDIA vGPU packages. Installing the kernel module package 'nvidia-vgpu-kmod' will take a few minutes, as it compiles the required kernel modules in the background.
[root@centos7]# yum -y install nvidia-vgpu-kmod nvidia-vgpu-gridd nvidia-vgpu-tools
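Because the kernel modules are compiled in the background, a quick check that the module has actually been loaded can save some guessing. The sketch below is an illustration only: `module_loaded` is a hypothetical helper, and the module name `nvidia` is an assumption about what the vGPU package installs. It reads /proc/modules, where the first field of each line is a loaded module's name.

```shell
#!/usr/bin/env bash
# Sketch (assumption: the vGPU kernel module is named "nvidia").

# Succeed if a module with exactly this name is listed in the modules file.
# The file defaults to /proc/modules and can be overridden for testing.
module_loaded() {
    local name="$1"
    local modules_file="${2:-/proc/modules}"
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$modules_file"
}

# Example use on the VM:
#   module_loaded nvidia && echo "nvidia module is loaded"
# The build state of dkms-managed modules can also be listed with: dkms status
```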
After a successful installation, the vGPU is now accessible and licensed.
[root@centos7]# nvidia-smi
Mon Jun 1 16:03:27 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.56       Driver Version: 440.56       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GRID V100D-8C       On   | 00000000:00:05.0 Off |                    0 |
| N/A   N/A    P0    N/A /  N/A |    560MiB /  8192MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
To check the license status and other information about the vGPU, run:
[root@centos7]# nvidia-smi -q |less

==============NVSMI LOG==============

Timestamp                           : Mon Jun 1 16:06:59 2020
Driver Version                      : 440.56
CUDA Version                        : 10.2

Attached GPUs                       : 1
GPU 00000000:00:05.0
    Product Name                    : GRID V100D-8C
    Product Brand                   : Grid
    Display Mode                    : Enabled
    Display Active                  : Disabled
    Persistence Mode                : Enabled
    Accounting Mode                 : Disabled
    Accounting Mode Buffer Size     : 4000
    Driver Model
        Current                     : N/A
        Pending                     : N/A
    Serial Number                   : N/A
    GPU UUID                        : GPU-315b585a-a41e-11ea-a63b-4ed0221b4f99
    Minor Number                    : 0
    VBIOS Version                   : 00.00.00.00.00
    MultiGPU Board                  : No
    Board ID                        : 0x5
    GPU Part Number                 : N/A
    Inforom Version
        Image Version               : N/A
        OEM Object                  : N/A
        ECC Object                  : N/A
        Power Management Object     : N/A
    GPU Operation Mode
        Current                     : N/A
        Pending                     : N/A
    GPU Virtualization Mode
        Virtualization Mode         : VGPU
        Host VGPU Mode              : N/A
    GRID Licensed Product
        Product Name                : NVIDIA vComputeServer
        License Status              : Licensed
    IBMNPU
        Relaxed Ordering Mode       : N/A
    PCI
        Bus                         : 0x00
        Device                      : 0x05
        Domain                      : 0x0000
        Device Id                   : 0x1DB610DE
        Bus Id                      : 00000000:00:05.0
        Sub System Id               : 0x139610DE
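Instead of paging through the full report, the license line can be picked out directly. This is a small sketch under the assumption that the report contains a `License Status : ...` line as in the output shown here; the helper names `license_status` and `vgpu_is_licensed` are hypothetical and not part of the NVIDIA tools.

```shell
#!/usr/bin/env bash
# Sketch: read "nvidia-smi -q" output on stdin and report the license state.

# Print whatever follows "License Status :" on the first matching line.
license_status() {
    sed -n 's/^[[:space:]]*License Status[[:space:]]*:[[:space:]]*//p' | head -n 1
}

# Succeed only if the reported status is exactly "Licensed".
vgpu_is_licensed() {
    [ "$(license_status)" = "Licensed" ]
}

# Example use on the VM:
#   nvidia-smi -q | vgpu_is_licensed && echo "vGPU is licensed"
```

Other driver versions may print additional text after the status, in which case the exact string comparison in `vgpu_is_licensed` would need to be relaxed.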
Preparation of a VM running CentOS8
Once the VM is available, update the OS to the latest available software, including the kernel, and reboot the VM so that it runs the latest kernel.
[root@centos8]# dnf -y update && reboot
Since the proprietary NVIDIA drivers need to be compiled against the running kernel, the dkms package from the EPEL repository is required.
[root@centos8]# dnf -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
Install the Arbutus Cloud repository package. It also installs the public key that the packages are signed with, which ensures their authenticity. These drivers and userspace tools are carefully tested against the infrastructure before they are made available.
[root@centos8]# dnf -y install http://repo.arbutus.cloud.computecanada.ca/pulp/repos/centos/arbutus-cloud-vgpu-repo.el8.noarch.rpm
The last step is to install the NVIDIA vGPU packages. Installing the kernel module package 'nvidia-vgpu-kmod' will take a few minutes, as it compiles the required kernel modules in the background.
[root@centos8]# dnf -y install nvidia-vgpu-kmod nvidia-vgpu-gridd nvidia-vgpu-tools
After a successful installation, the vGPU is now accessible and licensed. To check its status, use the same nvidia-smi commands as shown above for CentOS7.
Preparation of a VM running Debian10
Ensure that the latest packages are installed and the system has been booted into the latest stable kernel, as dkms will request the latest one available from the Debian repositories.
root@debian10:~# apt-get update && apt-get -y dist-upgrade && reboot
After a successful reboot, the system should be running the latest available kernel, and the repository can be set up by installing the repo package. This package also contains the GPG key that all packages are signed with.
root@debian10:~# apt-get -y install gnupg
root@debian10:~# wget http://repo.arbutus.cloud.computecanada.ca/pulp/deb/debian/pool/main/arbutus-cloud-repo_0.1_all.deb
root@debian10:~# dpkg -i arbutus-cloud-repo_0.1_all.deb
The installation of the package will display a warning, since the key is imported directly (for convenience) by the package's post-installation procedure.
Setting up arbutus-cloud-repo (0.1) ...
Warning: apt-key should not be used in scripts (called from postinst maintainerscript of the package arbutus-cloud-repo)
OK
Next, update the local apt cache and install the vGPU packages.
root@debian10:~# apt-get update && apt-get -y install nvidia-vgpu-kmod nvidia-vgpu-tools nvidia-vgpu-gridd
After a successful installation, the vGPU is now accessible and licensed. To check its status, use the same nvidia-smi commands as shown above for CentOS7.
Preparation of a VM running Ubuntu20
Ensure that the OS is up to date, all the latest patches are installed, and the latest stable kernel is running.
root@ubuntu20:~# apt-get update && apt-get -y dist-upgrade && reboot
After a successful reboot, the system should be running the latest available kernel, and the repository can be set up by installing the repo package. This package also contains the GPG key that all packages are signed with.
root@ubuntu20:~# wget http://repo.arbutus.cloud.computecanada.ca/pulp/deb/ubuntu/pool/main/arbutus-cloud-repo_0.1ubuntu20_all.deb
root@ubuntu20:~# dpkg -i arbutus-cloud-repo_0.1ubuntu20_all.deb
The same warning as on Debian10 will be displayed, since the signing key is added in the post-install stage while the package is being installed; it can be ignored. Next, update the local apt cache and install the vGPU packages.
root@ubuntu20:~# apt-get update && apt-get -y install nvidia-vgpu-kmod nvidia-vgpu-tools nvidia-vgpu-gridd
After a successful installation, the vGPU is now accessible and licensed. To check its status, use the same nvidia-smi commands as shown above for CentOS7.
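The final package installation step differs between the four distributions only in the package manager used. As a rough sketch for scripting it, the hypothetical helper below maps the `ID` and `VERSION_ID` values from /etc/os-release to the install command used in the corresponding section of this guide. It only prints the command and does not run it, and it assumes the repository package has already been installed as described.

```shell
#!/usr/bin/env bash
# Sketch: choose the vGPU package install command for a given distribution.
# Arguments are the ID and VERSION_ID values as found in /etc/os-release.
vgpu_install_command() {
    local id="$1"
    local version_id="$2"
    local pkgs="nvidia-vgpu-kmod nvidia-vgpu-gridd nvidia-vgpu-tools"
    # Compare on the major version only, e.g. "20.04" becomes "20".
    case "$id:${version_id%%.*}" in
        centos:7)            echo "yum -y install $pkgs" ;;
        centos:8)            echo "dnf -y install $pkgs" ;;
        debian:10|ubuntu:20) echo "apt-get -y install $pkgs" ;;
        *)                   echo "unsupported: $id $version_id" >&2; return 1 ;;
    esac
}

# Example use on the VM (the subshell keeps os-release variables local):
#   ( . /etc/os-release && vgpu_install_command "$ID" "$VERSION_ID" )
```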