Using cloud vGPUs: Difference between revisions

-->
-->


== Preparation of a VM running CentOS7 == <!--T:5-->
<!--T:24-->
Once the VM is available, update the OS to the latest available packages, including the kernel, then reboot the VM so that it runs the new kernel.
<pre>
[root@centos7]# yum -y update && reboot
</pre>
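After the reboot, it can be worth confirming that the VM really runs the newest installed kernel, since the driver modules are later built against the running kernel. The following is a minimal sketch, not part of the official procedure: the helper name <code>needs_reboot</code> is hypothetical, and "newest" is assumed to mean the most recently installed <code>kernel</code> package as reported by <code>rpm -q --last</code>.

```shell
#!/bin/bash
# Hypothetical helper: succeeds (exit 0) when the running kernel differs
# from the newest installed one, i.e. when a reboot is still needed.
needs_reboot() {
    local running="$1" newest="$2"
    [ "$running" != "$newest" ]
}

# On a CentOS 7 VM the two values can be gathered with uname and rpm.
# The guard makes this part a no-op on systems without a kernel RPM.
if command -v rpm >/dev/null 2>&1 && rpm -q kernel >/dev/null 2>&1; then
    running="$(uname -r)"
    # Assumption: the most recently installed kernel package is the newest one.
    newest="$(rpm -q --last kernel | head -n 1 | awk '{print $1}' | sed 's/^kernel-//')"
    if needs_reboot "$running" "$newest"; then
        echo "Running $running, newest installed is $newest: reboot required"
    else
        echo "Running the newest installed kernel: $running"
    fi
fi
```

If the script reports that a reboot is required, reboot again before installing the vGPU packages.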
<!--T:25-->
Because the proprietary NVIDIA drivers must be compiled against the running kernel, the [https://en.wikipedia.org/wiki/Dynamic_Kernel_Module_Support DKMS package] is required. It is provided by the [https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm EPEL repository], which must be enabled first.
<!--T:26-->
<pre>
[root@centos7]# yum -y install epel-release
</pre>
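To confirm that the EPEL repository is active before continuing, the output of <code>yum repolist enabled</code> can be checked for its repository id. This is a sketch under assumptions: the helper name <code>repo_enabled</code> is hypothetical, and the repository id is assumed to be <code>epel</code>, possibly followed by an architecture suffix such as <code>/x86_64</code>.

```shell
#!/bin/bash
# Hypothetical helper: reads `yum repolist enabled` output on stdin and
# succeeds if a repository with the given id ($1) is listed.
repo_enabled() {
    awk -v id="$1" '
        {
            repo = $1
            sub(/^[!*]/, "", repo)   # yum may prefix the id with ! or *
            sub(/\/.*$/, "", repo)   # drop a trailing /arch suffix such as /x86_64
            if (repo == id) found = 1
        }
        END { exit !found }'
}
```

On the VM, it could be used as <code>yum repolist enabled | repo_enabled epel && echo "EPEL is enabled"</code>.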
<!--T:27-->
Install the <b>Arbutus Cloud</b> [http://repo.arbutus.cloud.computecanada.ca/pulp/repos/centos/arbutus-cloud-vgpu-repo.el7.noarch.rpm repository].
This also installs the public key used to sign the packages, so that their authenticity can be verified.
The drivers and user-space tools in this repository are tested against the cloud infrastructure before they are made available.
<pre>
[root@centos7]# yum -y install http://repo.arbutus.cloud.computecanada.ca/pulp/repos/centos/7/x86_64/Packages/a/arbutus-cloud-vgpu-repo-1.1-1.el7.noarch.rpm
</pre>
<!--T:28-->
The last step is to install the <b>NVIDIA vGPU packages</b>.
Installing the kernel module package <code>nvidia-vgpu-kmod</code> takes a few minutes because it compiles the required kernel modules in the background.
<pre>
[root@centos7]# yum -y install nvidia-vgpu-kmod nvidia-vgpu-gridd nvidia-vgpu-tools
</pre>
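Because the modules are compiled in the background, the driver may not be loaded the moment <code>yum</code> returns. A minimal sketch for checking this is shown below. The helper name <code>nvidia_module_loaded</code> is hypothetical, and it assumes the kernel module is listed under the name <code>nvidia</code> in the output of <code>lsmod</code>.

```shell
#!/bin/bash
# Hypothetical helper: reads lsmod-style output on stdin and succeeds
# only if a module named exactly "nvidia" is listed in the first column.
nvidia_module_loaded() {
    awk '$1 == "nvidia" { found = 1 } END { exit !found }'
}
```

On the VM, <code>lsmod | nvidia_module_loaded && echo "nvidia module is loaded"</code> prints the message once the module is loaded. If nothing is printed, the build may still be running. <code>dkms status</code> lists the modules DKMS knows about and their build state.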
<!--T:29-->
If your installation was successful, the vGPU will be accessible and licensed.
Test this by running <code>nvidia-smi</code>:
<pre>
[root@centos7]# nvidia-smi
Tue Sep 21 17:40:33 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.91.03    Driver Version: 460.91.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GRID V100D-8C       On   | 00000000:00:05.0 Off |                  N/A |
| N/A   N/A    P0    N/A /  N/A |    560MiB /  8192MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
<!--T:53-->
</pre>
<!--T:30-->
To check the license status and other details of the vGPU, query the full device report:
<pre>
[root@centos7]# nvidia-smi -q | less
==============NVSMI LOG============== <!--T:54-->
<!--T:55-->
Timestamp                                : Tue Sep 21 17:41:48 2021
Driver Version                            : 460.91.03
CUDA Version                              : 11.2
<!--T:56-->
Attached GPUs                            : 1
GPU 00000000:00:05.0
    Product Name                          : GRID V100D-8C
    Product Brand                        : NVIDIA Virtual Compute Server
    Display Mode                          : Enabled
    Display Active                        : Disabled
    Persistence Mode                      : Enabled
    MIG Mode
        Current                          : N/A
        Pending                          : N/A
    Accounting Mode                      : Disabled
    Accounting Mode Buffer Size          : 4000
    Driver Model
        Current                          : N/A
        Pending                          : N/A
    Serial Number                        : N/A
    GPU UUID                              : GPU-c6d5d6c1-1b00-11ec-b031-a89a79e5169c
    Minor Number                          : 0
    VBIOS Version                        : 00.00.00.00.00
    MultiGPU Board                        : No
    Board ID                              : 0x5
    GPU Part Number                      : N/A
    Inforom Version
        Image Version                    : N/A
        OEM Object                        : N/A
        ECC Object                        : N/A
        Power Management Object          : N/A
    GPU Operation Mode
        Current                          : N/A
        Pending                          : N/A
    GPU Virtualization Mode
        Virtualization Mode              : VGPU
        Host VGPU Mode                    : N/A
    vGPU Software Licensed Product
        Product Name                      : NVIDIA Virtual Compute Server
        License Status                    : Licensed
    IBMNPU
        Relaxed Ordering Mode            : N/A
    PCI
        Bus                              : 0x00
        Device                            : 0x05
        Domain                            : 0x0000
        Device Id                        : 0x1DB610DE
        Bus Id                            : 00000000:00:05.0
</pre>
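The full report is long, so when only the license state matters it can be extracted from the <code>License Status</code> field shown above. The following is a minimal sketch: the helper name <code>license_status</code> is hypothetical, and the field layout is assumed to match the sample output on this page.

```shell
#!/bin/bash
# Hypothetical helper: reads `nvidia-smi -q` output on stdin and prints the
# value of the first "License Status" field, trimmed of surrounding spaces.
license_status() {
    awk '/License Status/ {
        sub(/^[^:]*:[ \t]*/, "")   # drop the field name and the first colon
        sub(/[ \t]+$/, "")         # drop trailing whitespace
        print
        exit
    }'
}
```

On the VM, <code>nvidia-smi -q | license_status</code> would print <code>Licensed</code> for the output shown above. An empty result means the field was not found in the report.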
<!-- ============ UPDATE PREPARATION DO NOT PUBLISH YET! ========================== -->
<!--
== Preparation of a VM running Almalinux8 == <!--T:76-->  


+-----------------------------------------------------------------------------------------+
</pre>
</pre>
-->


== Preparation of a VM running Almalinux9 == <!--T:76-->  