PyTorch: Difference between revisions
Revision as of 17:51, 23 July 2019
PyTorch is a Python package that provides two high-level features:
- Tensor computation (like NumPy) with strong GPU acceleration
- Deep neural networks built on a tape-based autograd system
PyTorch has a distant connection with Torch, but for all practical purposes you can treat them as separate projects.
PyTorch developers also offer LibTorch, which allows one to implement extensions to PyTorch using C++, and to implement pure C++ machine learning applications. Models written in Python using PyTorch can be converted and used in pure C++ through TorchScript.
Installation
Latest available wheels
To see the latest version of PyTorch that we have built:
[name@server ~]$ avail_wheels "torch*"
For more information on listing wheels, see listing available wheels.
Installing Compute Canada wheel
The preferred option is to install it using the Python wheel as follows:
- 1. Load a Python module, either python/2.7, python/3.5, python/3.6 or python/3.7
- 2. Create and start a virtual environment.
- 3. Install PyTorch in the virtual environment with pip install.
GPU and CPU
(venv) [name@server ~]$ pip install torch --no-index
Extra
In addition to torch, you can install torchvision, torchtext and torchaudio:
(venv) [name@server ~]$ pip install torch torchvision torchtext torchaudio --no-index
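One way to confirm which of these packages ended up in the active virtual environment is to ask Python whether each one can be located, without importing it. The following is a minimal sketch using only the standard library and assuming Python 3; the helper name find_installed is hypothetical, and the package list simply mirrors the pip command above.

```python
import importlib.util


def find_installed(names):
    """Map each top-level package name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Package names taken from the pip command above.
    packages = ["torch", "torchvision", "torchtext", "torchaudio"]
    for name, found in find_installed(packages).items():
        print("{}: {}".format(name, "installed" if found else "missing"))
```

Run it with the virtual environment activated; a package reported as missing was not installed into that environment.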
Job submission
Here is an example of a job submission script using the Python wheel, with a virtual environment created inside the job:
#!/bin/bash
#SBATCH --gres=gpu:1 # Request GPU "generic resources"
#SBATCH --cpus-per-task=6 # Cores proportional to GPUs: 6 on Cedar, 16 on Graham.
#SBATCH --mem=32000M # Memory proportional to GPUs: 32000 Cedar, 64000 Graham.
#SBATCH --time=0-03:00
#SBATCH --output=%N-%j.out
module load python/3.6
virtualenv --no-download $SLURM_TMPDIR/env
source $SLURM_TMPDIR/env/bin/activate
pip install torch --no-index
python pytorch-test.py
The Python script pytorch-test.py has the form:
import torch
x = torch.Tensor(5, 3)
print(x)
y = torch.rand(5, 3)
print(y)
# let us run the following only if CUDA is available
if torch.cuda.is_available():
    x = x.cuda()
    y = y.cuda()
    print(x + y)
You can then submit a PyTorch job with:
[name@server ~]$ sbatch pytorch-test.sh
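The comments in the job script say that cores and memory should scale with the number of GPUs requested. As a rough sketch of that arithmetic, the snippet below computes the #SBATCH values for a given GPU count from the per-GPU figures quoted in the script (6 cores and 32000M on Cedar, 16 cores and 64000M on Graham). The function name and dictionary layout are illustrative assumptions, not part of any cluster tooling, and the sketch does not check whether a node actually offers the requested number of GPUs.

```python
# Per-GPU figures quoted in the job script comments above.
PER_GPU = {
    "cedar": {"cpus": 6, "mem_mb": 32000},
    "graham": {"cpus": 16, "mem_mb": 64000},
}


def sbatch_resources(cluster, gpus=1):
    """Return #SBATCH lines with cores and memory scaled by the GPU count."""
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    per_gpu = PER_GPU[cluster.lower()]
    return [
        "#SBATCH --gres=gpu:{}".format(gpus),
        "#SBATCH --cpus-per-task={}".format(per_gpu["cpus"] * gpus),
        "#SBATCH --mem={}M".format(per_gpu["mem_mb"] * gpus),
    ]


if __name__ == "__main__":
    print("\n".join(sbatch_resources("cedar", gpus=2)))
```

For two GPUs on Cedar this yields --gres=gpu:2, --cpus-per-task=12 and --mem=64000M.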
Troubleshooting
Memory leak
On AVX512 hardware (Béluga, Skylake or V100 nodes), versions of PyTorch older than v1.0.1 that use older libraries (cuDNN < v7.5 or MAGMA < v2.5) may leak a considerable amount of memory, leading to an out-of-memory exception and the termination of your tasks. Please upgrade to the latest torch version.
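To find out whether an environment falls in the affected range, the installed PyTorch version can be compared against v1.0.1. The sketch below is one way to do that; parse_version and predates_fix are hypothetical helper names, the check covers only the PyTorch version (MAGMA is not inspected), and the cuDNN version is merely printed for reference.

```python
import re


def parse_version(version):
    """Return the leading major.minor[.patch] of a version string as ints."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError("unrecognized version string: {!r}".format(version))
    # A missing patch component is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def predates_fix(torch_version, fixed_in=(1, 0, 1)):
    """True if torch_version is older than the first unaffected release."""
    return parse_version(torch_version) < fixed_in


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
    else:
        print("torch", torch.__version__)
        # version() returns None when cuDNN is not available.
        print("cuDNN", torch.backends.cudnn.version())
        if predates_fix(torch.__version__):
            print("This version predates v1.0.1; consider upgrading.")
```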
LibTorch
LibTorch allows one to implement both C++ extensions to PyTorch and pure C++ machine learning applications. It contains "all headers, libraries and CMake configuration files required to depend on PyTorch" (as mentioned in the docs at https://pytorch.org/cppdocs/installing.html).
How to use LibTorch
Get the library
wget https://download.pytorch.org/libtorch/cu100/libtorch-shared-with-deps-latest.zip
unzip libtorch-shared-with-deps-latest.zip
cd libtorch
export LIBTORCH_ROOT=$(pwd) # this variable is used in the example below
Patch the library (this workaround is needed for compiling on Compute Canada clusters):
sed -i -e 's/\/usr\/local\/cuda\/lib64\/libculibos.a;dl;\/usr\/local\/cuda\/lib64\/libculibos.a;//g' share/cmake/Caffe2/Caffe2Targets.cmake
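The library is also included in the PyTorch wheel, but this is not the recommended way to acquire it. Once PyTorch is installed in a virtual environment, you can find it at: $VIRTUAL_ENV/lib/python3.6/site-packages/torch/lib/libtorch.so. The python3.6 component applies to a Python 3.6 environment; it follows the Python version used to create the virtual environment.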
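Because the directory name depends on the Python version, it can be convenient to let the interpreter report the location rather than typing it by hand. The following is a small sketch that relies only on the standard library; the function name expected_libtorch_path is hypothetical, and it assumes the wheel keeps the torch/lib/libtorch.so layout described above.

```python
import pathlib
import sysconfig


def expected_libtorch_path():
    """Path where libtorch.so would sit for the running interpreter."""
    # "platlib" is the site-packages directory for compiled packages; inside
    # an activated virtual environment it points into $VIRTUAL_ENV.
    site_packages = pathlib.Path(sysconfig.get_paths()["platlib"])
    return site_packages / "torch" / "lib" / "libtorch.so"


if __name__ == "__main__":
    path = expected_libtorch_path()
    print(path, "(found)" if path.exists() else "(not found)")
```

Run it with the virtual environment activated so that the reported site-packages directory is the one inside the environment.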
Compile a minimal example
Create the following two files:
File: example-app.cpp
#include <torch/torch.h>
#include <iostream>

int main() {
    // Default to the CPU and switch to the GPU when CUDA is available.
    torch::Device device(torch::kCPU);
    if (torch::cuda::is_available()) {
        std::cout << "CUDA is available! Using GPU." << std::endl;
        device = torch::Device(torch::kCUDA);
    }

    // Create a random 2x3 tensor on the selected device and print it.
    torch::Tensor tensor = torch::rand({2, 3}).to(device);
    std::cout << tensor << std::endl;
}
File: CMakeLists.txt
cmake_minimum_required(VERSION 3.0 FATAL_ERROR)
project(example-app)

find_package(Torch REQUIRED)

add_executable(example-app example-app.cpp)
target_link_libraries(example-app "${TORCH_LIBRARIES}")
set_property(TARGET example-app PROPERTY CXX_STANDARD 11)
Load the necessary modules:
module load cmake intel/2018.3 cuda/10 cudnn
Compile the program:
mkdir build
cd build
cmake -DCMAKE_PREFIX_PATH="$LIBTORCH_ROOT;$EBROOTCUDA;$EBROOTCUDNN" ..
make
Run the program:
./example-app
To test an application with CUDA, request an interactive job with a GPU.
Resources
https://pytorch.org/cppdocs/