Using cloud GPUs: Difference between revisions
Revision as of 12:33, 17 April 2020
This guide describes how to allocate GPU resources to a virtual machine (VM), install the necessary drivers, and check whether the GPU can be used.
To use a GPU within a VM, the instance needs to be deployed on one of the flavors listed below. The GPU will be available to the operating system via the PCI bus.
- g2-c24-112gb-500
- g1-c14-56gb-500
- g1-c14-56gb
Preparing a Debian 10 instance
To use the GPU via the PCI bus, the proprietary NVIDIA drivers are required. Due to Debian's policy, the drivers are available from the non-free pool only.
Enable the non-free pool
Log in using ssh and add the lines below to /etc/apt/sources.list, if they are not already there.
deb http://deb.debian.org/debian buster main contrib non-free
deb http://security.debian.org/ buster/updates main contrib non-free
deb http://deb.debian.org/debian buster-updates main contrib non-free
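The "if they are not already there" condition can be scripted. The following is a minimal sketch, not part of the original procedure: the function names `add_line_if_missing` and `enable_non_free` are hypothetical, and the check only detects a line that matches exactly. An existing entry for the same repository that lists only `main` would not be detected and should be edited by hand instead.

```shell
#!/bin/sh
# Append a line to a file only if that exact line is not already present.
# Usage: add_line_if_missing FILE LINE   (hypothetical helper)
add_line_if_missing() {
    file=$1
    line=$2
    # -x: match the whole line, -F: treat the pattern as a fixed string
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Add the three Debian 10 (buster) entries from this guide to the given file.
# Usage: enable_non_free FILE   (hypothetical helper)
enable_non_free() {
    target=$1
    add_line_if_missing "$target" "deb http://deb.debian.org/debian buster main contrib non-free"
    add_line_if_missing "$target" "deb http://security.debian.org/ buster/updates main contrib non-free"
    add_line_if_missing "$target" "deb http://deb.debian.org/debian buster-updates main contrib non-free"
}
```

Run as root, `enable_non_free /etc/apt/sources.list` would add only the missing entries; running it a second time changes nothing.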
Install the NVIDIA driver
The following command:
- updates the apt cache, so that apt will be aware of the new software pool sections,
- updates the OS to the latest software versions, and
- installs kernel headers, an NVIDIA driver, and pciutils, which will be required to list the devices connected to the PCI bus.
root@gpu2:~# apt-get update && apt-get -y dist-upgrade && apt-get -y install pciutils linux-headers-`uname -r` linux-headers-amd64 nvidia-driver
If this command finishes successfully, the NVIDIA driver will have been compiled and loaded.
- Check if the GPU is exposed on the PCI bus
root@gpu2:~# lspci -vk
[...]
00:05.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
        Subsystem: NVIDIA Corporation GK210GL [Tesla K80]
        Physical Slot: 5
        Flags: bus master, fast devsel, latency 0, IRQ 11
        Memory at fd000000 (32-bit, non-prefetchable) [size=16M]
        Memory at 1000000000 (64-bit, prefetchable) [size=16G]
        Memory at 1400000000 (64-bit, prefetchable) [size=32M]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Endpoint, MSI 00
        Kernel driver in use: nvidia
        Kernel modules: nvidia
[...]
- Check that the nvidia kernel module is loaded
root@gpu2:~# lsmod | grep nvidia
nvidia 17936384 0
nvidia_drm 16384 0
- Start nvidia-persistenced, which will create the necessary device files and make the GPU accessible in user space.
root@gpu2:~# service nvidia-persistenced restart
root@gpu2:~# ls -al /dev/nvidia*
crw-rw-rw- 1 root root 195,   0 Mar 6 18:55 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Mar 6 18:55 /dev/nvidiactl
crw-rw-rw- 1 root root 195, 254 Mar 6 18:55 /dev/nvidia-modeset
The GPU is now available in user space and can be used.
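The three checks above (PCI bus, kernel module, device files) can be wrapped in small helpers for use in a provisioning script. This is a minimal sketch under stated assumptions: the function names are hypothetical, the first two helpers read the output of `lspci -vk` and `lsmod` on standard input, and the device check only tests that `nvidia0` and `nvidiactl` exist, which assumes a single GPU as in the listing above.

```shell
#!/bin/sh
# Each helper returns 0 on success and non-zero otherwise.
# All function names are hypothetical; they are not part of any NVIDIA tool.

# Reads `lspci -vk` output on stdin; succeeds if a device is bound to
# the nvidia kernel driver.
pci_uses_nvidia_driver() {
    grep -q 'Kernel driver in use: nvidia$'
}

# Reads `lsmod` output on stdin; succeeds if the first column of some
# line is exactly "nvidia" (nvidia_drm alone does not count).
nvidia_module_loaded() {
    awk '$1 == "nvidia" { found = 1 } END { exit !found }'
}

# Succeeds if the device files for the first GPU and the control device
# exist in the given directory (default: /dev).
gpu_device_nodes_present() {
    dir=${1:-/dev}
    [ -e "$dir/nvidia0" ] && [ -e "$dir/nvidiactl" ]
}

# Example use on the instance:
#   lspci -vk | pci_uses_nvidia_driver \
#     && lsmod | nvidia_module_loaded \
#     && gpu_device_nodes_present \
#     && echo "GPU is ready"
```

Reading from standard input keeps the parsing separate from the commands themselves, so the helpers can also be run against saved output when debugging a failed deployment.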
Preparing a CentOS 7 instance
NVIDIA provides repositories for various distributions, so the required software can be installed and maintained via these repositories.
To compile the module sources from the NVIDIA repository, it is necessary to install dkms. This will automatically build the modules on kernel updates, and therefore ensures that the GPU is still working after any update of the OS. dkms is provided in the EPEL repository.
Kernel headers and the kernel source need to be installed before the NVIDIA driver can be set up.
Enable the EPEL repository and install needed software
[root@gpu-centos centos]# yum -y update && reboot
yum -y install epel-release && yum -y install dkms kernel-devel-$(uname -r) kernel-headers-$(uname -r)
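Because the packages are selected with `$(uname -r)`, they only match the running kernel if the reboot after `yum -y update` has taken place. A quick way to confirm that a build tree for the running kernel is present is to test the `build` link under `/lib/modules`. The sketch below assumes the usual layout, in which `/lib/modules/<release>/build` points to the installed kernel development files; the function name is hypothetical.

```shell
#!/bin/sh
# Succeeds if a kernel build tree for the given release is present.
# Usage: kernel_build_tree_present KERNEL_RELEASE [MODULES_ROOT]
# (hypothetical helper; MODULES_ROOT defaults to /lib/modules)
kernel_build_tree_present() {
    release=$1
    modules_root=${2:-/lib/modules}
    # -d follows symlinks, so a dangling "build" link counts as missing.
    [ -d "$modules_root/$release/build" ]
}

# Example use on the instance:
#   kernel_build_tree_present "$(uname -r)" \
#     || echo "no build tree for kernel $(uname -r)" >&2
```

If the check fails after the install step, the most likely cause is that the instance is still running the kernel from before the update.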
Add the NVIDIA repository and install the driver package
Install the yum repository:
[root@gpu-centos centos]# yum-config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-rhel7.repo
yum install -y cuda-drivers
NVIDIA uses its own GPG key to sign its packages. yum will ask to autoimport it. Reply "y" for "yes" when prompted.
Retrieving key from http://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/7fa2af80.pub
Importing GPG key 0x7FA2AF80:
 Userid     : "cudatools <cudatools@nvidia.com>"
 Fingerprint: ae09 fe4b bd22 3a84 b2cc fce3 f60f 4b3d 7fa2 af80
 From       : http://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/7fa2af80.pub
Is this ok [y/N]: y
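Before answering "y", it is worth comparing the fingerprint in the prompt with a value obtained from a trusted source. The sketch below only illustrates the comparison. The expected value is copied from the prompt output in this guide, which is an assumption: it may no longer be current and should be confirmed independently. The function names are hypothetical.

```shell
#!/bin/sh
# Fingerprint as displayed in the yum prompt shown in this guide.
# Assumption: confirm this value against a trusted source before relying on it.
EXPECTED_FPR="ae09 fe4b bd22 3a84 b2cc fce3 f60f 4b3d 7fa2 af80"

# Remove all whitespace and convert to lower case, so that differently
# formatted fingerprints compare equal.
normalize_fpr() {
    printf '%s' "$1" | tr -d '[:space:]' | tr '[:upper:]' '[:lower:]'
}

# Succeeds if the given fingerprint matches EXPECTED_FPR after normalization.
fpr_matches() {
    [ "$(normalize_fpr "$1")" = "$(normalize_fpr "$EXPECTED_FPR")" ]
}

# Example use:
#   fpr_matches "AE09 FE4B BD22 3A84 B2CC FCE3 F60F 4B3D 7FA2 AF80" \
#     && echo "fingerprint matches"
```

Normalizing both sides avoids false mismatches caused by spacing or letter case, which differ between tools that print fingerprints.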
After installation, reboot the VM to properly load the module and create the NVIDIA device files.
[root@gpu-centos ~]# ls -al /dev/nvidia*
crw-rw-rw-. 1 root root 195,   0 Mar 10 20:35 /dev/nvidia0
crw-rw-rw-. 1 root root 195, 255 Mar 10 20:35 /dev/nvidiactl
crw-rw-rw-. 1 root root 195, 254 Mar 10 20:35 /dev/nvidia-modeset
crw-rw-rw-. 1 root root 241,   0 Mar 10 20:35 /dev/nvidia-uvm
crw-rw-rw-. 1 root root 241,   1 Mar 10 20:35 /dev/nvidia-uvm-tools
The GPU is now accessible via any user space tool.