Using cloud GPUs: Difference between revisions

copy-editing
m (Use HTTPS to install repo)
(copy-editing)
Line 2: Line 2:
<languages />
<languages />
<translate>
<translate>
== How to use GPU in cloud VMs == <!--T:1-->


<!--T:2-->
<!--T:2-->
This howto describes the steps needed to allocate GPU resources to a virtual machine (VM), installing the necessary drivers as well as simple steps to on what to check to see that the GPU is properly allocated and cannow be used.
This guide describes how to allocate GPU resources to a virtual machine (VM), install the necessary drivers, and check whether the GPU can be used.


<!--T:3-->
<!--T:3-->
To use a GPU within a VM, the instance needs to be deployed with on for the flavors listed below, to make the GPU available to the Operating System via the PCI bus.
To use a GPU within a VM, the instance needs to be deployed on one of the flavors listed below.  The GPU will be available to the operating system via the PCI bus.


<!--T:4-->
<!--T:4-->
Line 15: Line 14:
* g1-c14-56gb
* g1-c14-56gb


== Preparing a Debian 10 Instance == <!--T:5-->
== Preparing a Debian 10 instance == <!--T:5-->
To use the GPU via the PCI bus, the proprietary Nvidia drivers are required. Due to Debian's policy, the drivers are available from the non-free pool only.


===== <u>Enabling the non-free pool</u> ===== <!--T:6-->
To use the GPU via the PCI bus, the proprietary NVIDIA drivers are required. Due to Debian's policy, the drivers are available from the non-free pool only.
Log in via ssh and add the sources below to ''/etc/apt/sources.list'', if not already in there.
 
===== Enable the non-free pool ===== <!--T:6-->
 
Log in using ssh and add the lines below to ''/etc/apt/sources.list'', if they are not already there.


<!--T:7-->
<!--T:7-->
Line 28: Line 29:
</pre>
</pre>
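
As a quick check before installing the driver, the following sketch reports whether an active <code>deb</code> line in a sources file enables the non-free section. The helper name <code>has_non_free</code> and the <code>SOURCES_LIST</code> override variable are illustrative assumptions and are not part of the original guide.

```shell
#!/bin/sh
# has_non_free FILE: succeed if FILE contains an uncommented "deb" line
# that lists the non-free section. (Hypothetical helper, for illustration.)
has_non_free() {
    grep -Eq '^[[:space:]]*deb[[:space:]].*[[:space:]]non-free([[:space:]]|$)' "$1" 2>/dev/null
}

# SOURCES_LIST is an assumed override; the default is the file named in the guide.
if has_non_free "${SOURCES_LIST:-/etc/apt/sources.list}"; then
    echo "non-free is enabled"
else
    echo "non-free is not enabled"
fi
```

The pattern only matches binary package lines (<code>deb</code>), so a file that lists non-free only on <code>deb-src</code> lines is reported as not enabled.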


===== <u>Installing the Nvidia Driver</u> ===== <!--T:8-->
===== Install the NVIDIA driver ===== <!--T:8-->
The following command will update the apt cache, so that apt will be aware of the new software pool sections, runs an upgrade to update the OS to the latest software versions and installs the kernel headers, the nvidia-driver and the pciutils, which will be required to list the devices connected to the PCI bus.  
 
The following command:
* updates the <code>apt</code> cache, so that <code>apt</code> will be aware of the new software pool sections,
* updates the OS to the latest software versions, and  
* installs kernel headers, an NVIDIA driver, and <code>pciutils</code>, which will be required to list the devices connected to the PCI bus.  


<!--T:9-->
<!--T:9-->
Line 35: Line 40:
root@gpu2:~# apt-get update && apt-get -y dist-upgrade && apt-get -y install pciutils linux-headers-`uname -r` linux-headers-amd64 nvidia-driver
root@gpu2:~# apt-get update && apt-get -y dist-upgrade && apt-get -y install pciutils linux-headers-`uname -r` linux-headers-amd64 nvidia-driver
</pre>
</pre>
After the installation has finished and the nvidia has been automatically compiled and loaded, the following steps can be used to verify that everything has been prepared to launch the ''nvidia-persistenced'', which will create the device files and makes the GPU accessible to the user space.
 
If this command finishes successfully, the NVIDIA driver will have been compiled and loaded.


<!--T:10-->
<!--T:10-->
Line 58: Line 64:


<!--T:11-->
<!--T:11-->
* Check that the nvidia kernel module is loaded
* Check that the <code>nvidia</code> kernel module is loaded
<pre>
<pre>
root@gpu2:~# lsmod | grep nvidia
root@gpu2:~# lsmod | grep nvidia
Line 66: Line 72:


<!--T:12-->
<!--T:12-->
Now the userspace process can be started, which will create the necessary character device files.
* Start <code>nvidia-persistenced</code>, which will create the necessary device files and make the GPU accessible in user space.
<pre>
<pre>
root@gpu2:~# service nvidia-persistenced  restart
root@gpu2:~# service nvidia-persistenced  restart
<!--T:13-->
root@gpu2:~# ls -al /dev/nvidia*
root@gpu2:~# ls -al /dev/nvidia*
crw-rw-rw- 1 root root 195,  0 Mar  6 18:55 /dev/nvidia0
crw-rw-rw- 1 root root 195,  0 Mar  6 18:55 /dev/nvidia0
Line 80: Line 84:
The GPU is now available within the user space and can be used.
The GPU is now available within the user space and can be used.
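
The module and device-file checks above can be combined into one small script. This is a minimal sketch, not part of the original guide: the function names <code>module_loaded</code> and <code>count_device_files</code> are hypothetical, and it only reports what it finds without changing anything.

```shell
#!/bin/sh
# Sketch: re-run the verification checks in one go.
# Function names are illustrative assumptions.

# module_loaded NAME: read lsmod-style output on stdin and succeed if
# NAME appears in the first column (exact match, so nvidia_drm != nvidia).
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

# count_device_files DIR: print how many entries in DIR start with "nvidia".
count_device_files() {
    ls "$1" 2>/dev/null | grep -c '^nvidia'
}

if command -v lsmod >/dev/null 2>&1 && lsmod | module_loaded nvidia; then
    echo "nvidia module: loaded"
else
    echo "nvidia module: not loaded"
fi

echo "nvidia device files in /dev: $(count_device_files /dev)"
```

A count of zero after the module is loaded suggests that <code>nvidia-persistenced</code> has not been started yet.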


== Preparing a CentOS 7 Instance == <!--T:15-->
== Preparing a CentOS 7 instance == <!--T:15-->
Nvidia provides repositories for various distributions, therefore the required software can be installed and maintained via these repositories.
 
NVIDIA provides repositories for various distributions, so the required software can be installed and maintained via these repositories.


<!--T:16-->
<!--T:16-->
To compile the module sources from the nvidia repository, it is necessary to install dkms to automatically build the modules on kernel updates.
To compile the module sources from the NVIDIA repository, it is necessary to install <code>dkms</code>.
It ensures that the GPU is still working after OS updates, dkms is provided in the EPEL repository and additionally the kernel headers and the kernel source needs to be installed
This will automatically build the modules on kernel updates, and therefore ensures that the GPU is still working after any update of the OS.
before the nvidia driver can be set up.
<code>dkms</code> is provided in the EPEL repository.
Kernel headers and the kernel source need to be installed before the NVIDIA driver can be set up.
 
===== Enable the EPEL repository and install needed software ===== <!--T:17-->


===== <u>Enabling the EPEL repository and install needed software</u> ===== <!--T:17-->
<pre>
<pre>
[root@gpu-centos centos]# yum -y update && reboot
[root@gpu-centos centos]# yum -y update && reboot
Line 94: Line 101:
</pre>
</pre>


===== <u>Adding the NVIDIA repository and install the driver package</u> ===== <!--T:18-->
===== Add the NVIDIA repository and install the driver package ===== <!--T:18-->
Install the YUM repository:
 
Install the <code>yum</code> repository:
<pre>
<pre>
[root@gpu-centos centos]# yum-config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-rhel7.repo
[root@gpu-centos centos]# yum-config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-rhel7.repo
Line 102: Line 110:


<!--T:19-->
<!--T:19-->
Nvidia uses its own gpg key to sign its packages, yum will ask to autoimport it.
NVIDIA uses its own GPG key to sign its packages.  <code>yum</code> will ask to autoimport it. Reply "y" for "yes" when prompted.
 
<!--T:20-->
<pre>
<pre>
Retrieving key from http://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/7fa2af80.pub
Retrieving key from http://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/7fa2af80.pub
Line 115: Line 121:


<!--T:21-->
<!--T:21-->
After the installation a reboot is required to properly load the module and create the nvidia device files.
After installation, reboot the VM to properly load the module and create the NVIDIA device files.
<pre>
<pre>
[root@gpu-centos ~]# ls -al /dev/nvidia*
[root@gpu-centos ~]# ls -al /dev/nvidia*