Using cloud vGPUs: Difference between revisions

update to recent OS versions
No edit summary
(update to recent OS versions)
Line 1: Line 1:
<languages />
<languages />
<translate>
<translate>
<!--T:2-->
 
This page describes how to  
This page describes how to  
* allocate virtual GPU (vGPU) resources to a virtual machine (VM),  
* allocate virtual GPU (vGPU) resources to a virtual machine (VM),  
Line 9: Line 9:
If you choose to install the toolkit directly from NVIDIA, please ensure that the vGPU driver is not overwritten with the one from the CUDA package.
If you choose to install the toolkit directly from NVIDIA, please ensure that the vGPU driver is not overwritten with the one from the CUDA package.


== Supported flavors == <!--T:23-->
== Supported flavors ==


<!--T:3-->
<!--T:3-->
Line 18: Line 18:
* g1-16gb-c8-40gb
* g1-16gb-c8-40gb


== Preparation of a VM running Almalinux9 == <!--T:76-->
== Preparation of a VM running AlmaLinux 9 ==  


<!--T:77-->
Once the VM is available, make sure to update the OS to the latest available software, including the kernel.
Once the VM is available, make sure to update the OS to the latest available software, including the kernel.
Then, reboot the VM to have the latest kernel running.
Then, reboot the VM to have the latest kernel running.
Line 26: Line 25:
To have access to the [https://en.wikipedia.org/wiki/Dynamic_Kernel_Module_Support DKMS package], the [https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm EPEL repository] is required.
To have access to the [https://en.wikipedia.org/wiki/Dynamic_Kernel_Module_Support DKMS package], the [https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm EPEL repository] is required.


Almalinux9 has per default a faulty nouveau driver which crashes the kernel as soon as the nvidia driver is mounted, the VM needs a few extra steps to prevent
By default, AlmaLinux 9 ships a faulty <code>nouveau</code> driver, which crashes the kernel as soon as the <code>nvidia</code> driver is loaded.
the loading of the nouveau driver when the system boots.
The VM therefore needs a few extra steps to prevent the <code>nouveau</code> driver from loading when the system boots.


</translate>
<pre>
<pre>
[root@almalinux9]# echo -e "blacklist nouveau\noptions nouveau modeset=0" >/etc/modprobe.d/blacklist-nouveau.conf
[root@almalinux9]# echo -e "blacklist nouveau\noptions nouveau modeset=0" >/etc/modprobe.d/blacklist-nouveau.conf
Line 34: Line 34:
[root@almalinux9]# dnf -y update && dnf -y install epel-release && reboot
[root@almalinux9]# dnf -y update && dnf -y install epel-release && reboot
</pre>
</pre>
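
Once the VM has rebooted, it can be worth confirming that the blacklist took effect before installing the vGPU driver. The following is a minimal sketch, not part of the official procedure. The helper name <code>nouveau_loaded</code> is hypothetical, and it assumes the kernel module list is readable at <code>/proc/modules</code> (the file <code>lsmod</code> reads).

```shell
# Return success (0) if the nouveau module appears in the given module list.
# The list defaults to /proc/modules; a different file can be passed for testing.
nouveau_loaded() {
    modules_file="${1:-/proc/modules}"
    grep -q '^nouveau ' "$modules_file" 2>/dev/null
}

if nouveau_loaded; then
    echo "nouveau is still loaded; check /etc/modprobe.d/blacklist-nouveau.conf"
else
    echo "nouveau is not loaded"
fi
```

If the module is still listed, the blacklist file may contain a typo, or the initramfs may still include the driver.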
<translate>


<!--T:78-->
After the reboot of the VM, the Arbutus vGPU Cloud repository needs to be installed.  
After the reboot of the VM, the Arbutus vGPU Cloud repository needs to be installed.  


<!--T:79-->
</translate>
<pre>
<pre>
[root@almalinux9]# dnf install http://repo.arbutus.cloud.computecanada.ca/pulp/repos/alma9/Packages/a/arbutus-cloud-vgpu-repo-1.0-1.el9.noarch.rpm</pre>
[root@almalinux9]# dnf install http://repo.arbutus.cloud.computecanada.ca/pulp/repos/alma9/Packages/a/arbutus-cloud-vgpu-repo-1.0-1.el9.noarch.rpm</pre>
<translate>


<!--T:80-->
The next step is to install the vGPU packages, which will install the required driver and user-space tools.
The next step is to install the vGPU packages, which will install the required driver and user-space tools.


<!--T:81-->
</translate>
<pre>
<pre>
[root@almalinux9]# dnf -y install nvidia-vgpu-gridd.x86_64 nvidia-vgpu-tools.x86_64 nvidia-vgpu-kmod.x86_64
[root@almalinux9]# dnf -y install nvidia-vgpu-gridd.x86_64 nvidia-vgpu-tools.x86_64 nvidia-vgpu-kmod.x86_64
</pre>
</pre>
<translate>


<!--T:82-->
After a successful installation, <code>nvidia-smi</code> can be used to verify the proper functionality.
After a successful installation, <b>nvidia-smi</b> can be used to verify the proper functionality.


<!--T:83-->
</translate>
<pre>
<pre>
[root@almalinux9]# nvidia-smi  
[root@almalinux9]# nvidia-smi  
Line 77: Line 77:
+-----------------------------------------------------------------------------------------+
+-----------------------------------------------------------------------------------------+
</pre>
</pre>
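
For scripted checks, the same information can be requested in CSV form with the <code>--query-gpu</code> and <code>--format</code> options of <code>nvidia-smi</code>. The sketch below assumes that vGPU device names start with <code>GRID</code>, as in the sample output on this page (<code>GRID V100D-8C</code>). Other driver versions may report different names, so treat the pattern as an assumption. The helper name <code>is_vgpu_line</code> is hypothetical.

```shell
# Succeed if a CSV line of the form "name, driver_version, memory.total"
# describes a device whose name starts with "GRID" (assumed vGPU naming).
is_vgpu_line() {
    case "$1" in
        GRID\ *) return 0 ;;
        *)       return 1 ;;
    esac
}

if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv,noheader |
    while IFS= read -r line; do
        if is_vgpu_line "$line"; then
            echo "vGPU detected: $line"
        else
            echo "unexpected device: $line"
        fi
    done
else
    echo "nvidia-smi not found; the vGPU packages may not be installed"
fi
```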
<translate>


<!-- ============ UPDATE PREPARATION DO NOT PUBLISH YET! ========================== -->
== Preparation of a VM running AlmaLinux 8 ==  
<!--
== Preparation of a VM running Almalinux8 == <!--T:76-->


<!--T:77-->
Once the VM is available, make sure to update the OS to the latest available software, including the kernel. Then, reboot the VM to have the latest kernel running.
Once the VM is available, make sure to update the OS to the latest available software, including the kernel. Then, reboot the VM to have the latest kernel running.
To have access to the [https://en.wikipedia.org/wiki/Dynamic_Kernel_Module_Support DKMS package], the [https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm EPEL repository] is required.
To have access to the [https://en.wikipedia.org/wiki/Dynamic_Kernel_Module_Support DKMS package], the EPEL repository is required.
<pre>
[root@vgpu almalinux]# dnf -y update && dnf -y install epel-release && reboot
</pre>


<!--T:78-->
</translate>
After the reboot of the VM, the [http://repo.arbutus.cloud.computecanada.ca/pulp/repos/alma8/Packages/a/arbutus-cloud-vgpu-repo-1.0-1.el8.noarch.rpm Arbutus Cloud repository]
needs to be installed.
 
<!--T:79-->
<pre>
[root@almalinux8]# dnf install http://repo.arbutus.cloud.computecanada.ca/pulp/repos/alma8/Packages/a/arbutus-cloud-vgpu-repo-1.0-1.el8.noarch.rpm
</pre>
 
<!--T:80-->
The next step is to install the vGPU packages, which will install the required driver and user-space tools.
 
<!--T:81-->
<pre>
[root@vgpu almalinux]# dnf -y install nvidia-vgpu-gridd.x86_64 nvidia-vgpu-tools.x86_64 nvidia-vgpu-kmod.x86_64
</pre>
 
<!--T:82-->
After a successful installation, <b>nvidia-smi</b> can be used to verify the proper functionality.
 
<!--T:83-->
<pre>
[root@almalinux8]# nvidia-smi
Tue Apr 23 16:37:31 2024     
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4    |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf          Pwr:Usage/Cap |          Memory-Usage | GPU-Util  Compute M. |
|                                        |                        |              MIG M. |
|=========================================+========================+======================|
|  0  GRID V100D-8C                  On  |  00000000:00:06.0 Off |                    0 |
| N/A  N/A    P0            N/A /  N/A  |      0MiB /  8192MiB |      0%      Default |
|                                        |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                       
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU  GI  CI        PID  Type  Process name                              GPU Memory |
|        ID  ID                                                              Usage      |
|=========================================================================================|
|  No running processes found                                                            |
+-----------------------------------------------------------------------------------------+
</pre>
-->
 
== Preparation of a VM running Almalinux8 == <!--T:76-->
 
<!--T:77-->
Once the VM is available, make sure to update the OS to the latest available software, including the kernel. Then, reboot the VM to have the latest kernel running.
To have access to the [https://en.wikipedia.org/wiki/Dynamic_Kernel_Module_Support DKMS package], the EPEL repository is required.
<pre>
<pre>
[root@vgpu almalinux]# dnf -y update && dnf -y install epel-release && reboot
[root@vgpu almalinux]# dnf -y update && dnf -y install epel-release && reboot
</pre>
</pre>
<translate>


<!--T:78-->
After the reboot of the VM, the Arbutus vGPU Cloud repository needs to be installed.
After the reboot of the VM, the [http://repo.arbutus.cloud.computecanada.ca/pulp/repos/alma8/Packages/a/arbutus-cloud-vgpu-repo-1.0-1.el8.noarch.rpm Arbutus vGPU Cloud repository] needs to be installed.


<!--T:79-->
</translate>
<pre>
<pre>
[root@almalinux8]# dnf install http://repo.arbutus.cloud.computecanada.ca/pulp/repos/alma8/Packages/a/arbutus-cloud-vgpu-repo-1.0-1.el8.noarch.rpm
[root@almalinux8]# dnf install http://repo.arbutus.cloud.computecanada.ca/pulp/repos/alma8/Packages/a/arbutus-cloud-vgpu-repo-1.0-1.el8.noarch.rpm
</pre>
</pre>
<translate>


<!--T:80-->
The next step is to install the vGPU packages, which will install the required driver and user-space tools.
The next step is to install the vGPU packages, which will install the required driver and user-space tools.
 
</translate>
<!--T:81-->
<pre>
<pre>
[root@vgpu almalinux]# dnf -y install nvidia-vgpu-gridd.x86_64 nvidia-vgpu-tools.x86_64 nvidia-vgpu-kmod.x86_64
[root@vgpu almalinux]# dnf -y install nvidia-vgpu-gridd.x86_64 nvidia-vgpu-tools.x86_64 nvidia-vgpu-kmod.x86_64
</pre>
</pre>
<translate>


<!--T:82-->
After a successful installation, <code>nvidia-smi</code> can be used to verify the proper functionality.
After a successful installation, <b>nvidia-smi</b> can be used to verify the proper functionality.


<!--T:83-->
</translate>
<pre>
<pre>
[root@almalinux8]# nvidia-smi  
[root@almalinux8]# nvidia-smi  
Line 187: Line 130:
+-----------------------------------------------------------------------------------------+
+-----------------------------------------------------------------------------------------+
</pre>
</pre>
<translate>


== Preparation of a VM running Debian11 == <!--T:64-->
== Preparation of a VM running Debian 11 ==
Ensure that the latest packages are installed and the system has been booted with the latest stable kernel, as <b>DKMS</b> will request the latest one available from the Debian repositories.
Ensure that the latest packages are installed and the system has been booted with the latest stable kernel, as <b>DKMS</b> will request the latest one available from the Debian repositories.


<!--T:65-->
</translate>
<pre>
<pre>
root@debian11:~# apt-get update && apt-get -y dist-upgrade && reboot
root@debian11:~# apt-get update && apt-get -y dist-upgrade && reboot
</pre>  
</pre>
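
After the reboot, one way to confirm that the VM is running the newest installed kernel is to compare <code>uname -r</code> with the module directories under <code>/lib/modules</code>. This is only a sketch. It assumes GNU <code>sort</code> (for the <code>-V</code> version ordering) and that every installed kernel has a directory in <code>/lib/modules</code>. The helper name <code>newest_installed_kernel</code> is hypothetical.

```shell
# Print the highest kernel version that has a module directory under $1
# (default: /lib/modules), using GNU sort's version ordering.
newest_installed_kernel() {
    ls -1 "${1:-/lib/modules}" 2>/dev/null | sort -V | tail -n 1
}

running="$(uname -r)"
newest="$(newest_installed_kernel)"
if [ "$running" = "$newest" ]; then
    echo "running the newest installed kernel: $running"
else
    echo "running $running, newest installed is ${newest:-unknown}; a reboot may be needed"
fi
```

Leftover module directories from removed kernels can produce a false mismatch, so the result is a hint rather than a definitive answer.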
<translate>


<!--T:66-->
After a successful reboot, the system should be running the latest available kernel, and the repository can be set up by installing the <code>arbutus-cloud-repo</code> package.
After a successful reboot, the system should be running the latest available kernel, and the repository can be set up by installing the <code>arbutus-cloud-repo</code> package.
This package also contains the GPG key with which all packages are signed.
This package also contains the GPG key with which all packages are signed.


<!--T:67-->
</translate>
<pre>
<pre>
root@debian11:~# wget http://repo.arbutus.cloud.computecanada.ca/pulp/deb/deb11/pool/main/arbutus-cloud-repo_0.1_all.deb
root@debian11:~# wget http://repo.arbutus.cloud.computecanada.ca/pulp/deb/deb11/pool/main/arbutus-cloud-repo_0.1_all.deb
root@debian11:~# apt-get install -y ./arbutus-cloud-repo_0.1_all.deb
root@debian11:~# apt-get install -y ./arbutus-cloud-repo_0.1_all.deb
</pre>
</pre>
<translate>


<!--T:68-->
Update the local apt cache and install the vGPU packages:
Update the local apt cache and install the vGPU packages:


<!--T:69-->
</translate>
<pre>
<pre>
root@debian11:~# apt-get update && apt-get -y install nvidia-vgpu-kmod nvidia-vgpu-tools nvidia-vgpu-gridd
root@debian11:~# apt-get update && apt-get -y install nvidia-vgpu-kmod nvidia-vgpu-tools nvidia-vgpu-gridd
Line 237: Line 181:
+-----------------------------------------------------------------------------------------+
+-----------------------------------------------------------------------------------------+
</pre>
</pre>
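
If <code>nvidia-smi</code> does not behave as expected, a quick first check is whether all three vGPU packages are known to the package manager. The sketch below uses a hypothetical helper, <code>missing_packages</code>, which takes the check command as its first argument. On Debian and Ubuntu, <code>dpkg -s</code> exits non-zero for packages that dpkg does not know about; note that it can still succeed for a package that was removed but left its configuration files behind.

```shell
# Print each package from the list for which the given check command fails.
# $1 is the check command (for example "dpkg -s"); the rest are package names.
missing_packages() {
    check_cmd="$1"; shift
    for pkg in "$@"; do
        $check_cmd "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}

# Lists any of the three vGPU packages that dpkg does not know about.
missing_packages "dpkg -s" nvidia-vgpu-kmod nvidia-vgpu-tools nvidia-vgpu-gridd
```

No output means all three packages were found. On AlmaLinux, <code>rpm -q</code> could be passed as the check command instead.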
<translate>


== Preparation of a VM running Debian12 == <!--T:64-->
== Preparation of a VM running Debian 12 ==  
Ensure that the latest packages are installed and the system has been booted with the latest stable kernel, as <b>DKMS</b> will request the latest one available from the Debian repositories.
Ensure that the latest packages are installed and the system has been booted with the latest stable kernel, as <b>DKMS</b> will request the latest one available from the Debian repositories.


<!--T:65-->
</translate>
<pre>
<pre>
root@debian12:~# apt-get update && apt-get -y dist-upgrade && reboot
root@debian12:~# apt-get update && apt-get -y dist-upgrade && reboot
</pre>  
</pre>  
<translate>


<!--T:66-->
After a successful reboot, the system should be running the latest available kernel, and the repository can be set up by installing the <code>arbutus-cloud-repo</code> package.
After a successful reboot, the system should be running the latest available kernel, and the repository can be set up by installing the <code>arbutus-cloud-repo</code> package.
This package also contains the GPG key with which all packages are signed.
This package also contains the GPG key with which all packages are signed.


<!--T:67-->
</translate>
<pre>
<pre>
root@debian12:~# wget http://repo.arbutus.cloud.computecanada.ca/pulp/deb/deb12/pool/main/arbutus-cloud-repo_0.1+deb12_all.deb
root@debian12:~# wget http://repo.arbutus.cloud.computecanada.ca/pulp/deb/deb12/pool/main/arbutus-cloud-repo_0.1+deb12_all.deb
root@debian12:~# apt-get install -y ./arbutus-cloud-repo_0.1+deb12_all.deb
root@debian12:~# apt-get install -y ./arbutus-cloud-repo_0.1+deb12_all.deb
</pre>
</pre>
<translate>


<!--T:68-->
Update the local apt cache and install the vGPU packages:
Update the local apt cache and install the vGPU packages:


<!--T:69-->
</translate>
<pre>
<pre>
root@debian12:~# apt-get update && apt-get -y install nvidia-vgpu-kmod nvidia-vgpu-tools nvidia-vgpu-gridd
root@debian12:~# apt-get update && apt-get -y install nvidia-vgpu-kmod nvidia-vgpu-tools nvidia-vgpu-gridd
Line 287: Line 232:
+-----------------------------------------------------------------------------------------+
+-----------------------------------------------------------------------------------------+
</pre>
</pre>
<translate>


== Preparation of a VM running Ubuntu22 == <!--T:8-->
== Preparation of a VM running Ubuntu 22 ==  
Ensure that the OS is up to date, that all the latest patches are installed, and that the latest stable kernel is running.
Ensure that the OS is up to date, that all the latest patches are installed, and that the latest stable kernel is running.


<!--T:48-->
</translate>
<pre>
<pre>
root@ubuntu22:~# apt-get update && apt-get -y dist-upgrade && reboot
root@ubuntu22:~# apt-get update && apt-get -y dist-upgrade && reboot
</pre>
</pre>
<translate>


<!--T:49-->
After a successful reboot, the system should have the latest available kernel running.  
After a successful reboot, the system should have the latest available kernel running.  
Now the repository can be set up by installing the <code>arbutus-cloud-repo</code> package.
Now the repository can be set up by installing the <code>arbutus-cloud-repo</code> package.
This package also contains the GPG key with which all packages are signed.
This package also contains the GPG key with which all packages are signed.


<!--T:50-->
</translate>
<pre>
<pre>
root@ubuntu22:~# wget http://repo.arbutus.cloud.computecanada.ca/pulp/deb/ubnt22/pool/main/arbutus-cloud-repo_0.1_all.deb
root@ubuntu22:~# wget http://repo.arbutus.cloud.computecanada.ca/pulp/deb/ubnt22/pool/main/arbutus-cloud-repo_0.1_all.deb
root@ubuntu22:~# apt-get install ./arbutus-cloud-repo_0.1_all.deb
root@ubuntu22:~# apt-get install ./arbutus-cloud-repo_0.1_all.deb
</pre>
</pre>
<translate>


<!--T:51-->
Update the local apt cache and install the vGPU packages:
Update the local apt cache and install the vGPU packages:
</translate>
<pre>
<pre>
root@ubuntu22:~# apt-get update && apt-get -y install nvidia-vgpu-kmod nvidia-vgpu-tools nvidia-vgpu-gridd
root@ubuntu22:~# apt-get update && apt-get -y install nvidia-vgpu-kmod nvidia-vgpu-tools nvidia-vgpu-gridd
</pre>
</pre>
<translate>


<!--T:52-->
If your installation was successful, the vGPU will be accessible and licensed.
If your installation was successful, the vGPU will be accessible and licensed.


</translate>
<pre>
<pre>
root@ubuntu22:~# nvidia-smi  
root@ubuntu22:~# nvidia-smi  
Line 339: Line 287:
+-----------------------------------------------------------------------------------------+
+-----------------------------------------------------------------------------------------+
</pre>
</pre>
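
The default <code>nvidia-smi</code> table does not show the license state. The full query output (<code>nvidia-smi -q</code>) usually does, so a check along the following lines may help. This is a sketch under assumptions: the field label <code>License Status</code> is what recent vGPU guest drivers are expected to print, but it is not confirmed by this page and may differ between driver versions. The helper name <code>license_status</code> is hypothetical.

```shell
# Read `nvidia-smi -q` output on stdin and print the value of the first
# "License Status" field (assumed label), or nothing if it is absent.
license_status() {
    sed -n 's/^[[:space:]]*License Status[[:space:]]*:[[:space:]]*//p' | head -n 1
}

if command -v nvidia-smi >/dev/null 2>&1; then
    status="$(nvidia-smi -q | license_status)"
    echo "license status: ${status:-not reported}"
else
    echo "nvidia-smi not found"
fi
```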
<translate>


== Preparation of a VM running Ubuntu20 == <!--T:8-->
== Preparation of a VM running Ubuntu 20 ==
Ensure that the OS is up to date, that all the latest patches are installed, and that the latest stable kernel is running.
Ensure that the OS is up to date, that all the latest patches are installed, and that the latest stable kernel is running.


<!--T:48-->
</translate>
<pre>
<pre>
root@ubuntu20:~# apt-get update && apt-get -y dist-upgrade && reboot
root@ubuntu20:~# apt-get update && apt-get -y dist-upgrade && reboot
</pre>
</pre>
<translate>


<!--T:49-->
After a successful reboot, the system should have the latest available kernel running.  
After a successful reboot, the system should have the latest available kernel running.  
Now the repository can be set up by installing the <code>arbutus-cloud-repo</code> package.
Now the repository can be set up by installing the <code>arbutus-cloud-repo</code> package.
This package also contains the GPG key with which all packages are signed.
This package also contains the GPG key with which all packages are signed.


<!--T:50-->
</translate>
<pre>
<pre>
root@ubuntu20:~# wget http://repo.arbutus.cloud.computecanada.ca/pulp/deb/ubnt20/pool/main/arbutus-cloud-repo_0.1ubuntu20_all.deb
root@ubuntu20:~# wget http://repo.arbutus.cloud.computecanada.ca/pulp/deb/ubnt20/pool/main/arbutus-cloud-repo_0.1ubuntu20_all.deb
root@ubuntu20:~# apt-get install ./arbutus-cloud-repo_0.1ubuntu20_all.deb
root@ubuntu20:~# apt-get install ./arbutus-cloud-repo_0.1ubuntu20_all.deb
</pre>
</pre>
<translate>


<!--T:51-->
Update the local apt cache and install the vGPU packages:
Update the local apt cache and install the vGPU packages:
</translate>
<pre>
<pre>
root@ubuntu20:~# apt-get update && apt-get -y install nvidia-vgpu-kmod nvidia-vgpu-tools nvidia-vgpu-gridd
root@ubuntu20:~# apt-get update && apt-get -y install nvidia-vgpu-kmod nvidia-vgpu-tools nvidia-vgpu-gridd
</pre>
</pre>
<translate>


<!--T:52-->
If your installation was successful, the vGPU will be accessible and licensed.
If your installation was successful, the vGPU will be accessible and licensed.


</translate>
<pre>
<pre>
root@ubuntu20:~# nvidia-smi  
root@ubuntu20:~# nvidia-smi  
Line 391: Line 342:
+-----------------------------------------------------------------------------------------+
+-----------------------------------------------------------------------------------------+
</pre>
</pre>
</translate>
 
[[Category:Cloud]]
[[Category:Cloud]]