RAPIDS: Difference between revisions

 
(74 intermediate revisions by 4 users not shown)
Line 1: Line 1:
{{draft}}
<languages />
<languages />
=Overview=


[https://rapids.ai/ RAPIDS] is a suite of open source software libraries from NVIDIA, mainly for executing data science and analytics pipelines on GPUs. It relies on NVIDIA CUDA primitives for low level compute optimization and provides users with friendly Python APIs, similar to those in Pandas, Scikit-learn, etc.
<translate>
=Overview= <!--T:1-->


Since RAPIDS is available as Conda packages which require having [[Anaconda/en|Anaconda]] for the installation, however Anaconda is not advised to use on the Compute Canada clusters. Instead, a container solution of using [[Singularity|Singularity]] is recommended. As RAPIDS is also available as Docker container images from NVIDIA, and a Singularity image for RAPIDS can be built based from a Docker image.
<!--T:2-->
[https://rapids.ai/ RAPIDS] is a suite of open source software libraries from NVIDIA mainly for executing data science and analytics pipelines in Python on GPUs. It relies on NVIDIA CUDA primitives for low-level compute optimization and provides friendly Python APIs, similar to those in Pandas or Scikit-learn.


This page provides the instructions for working with RAPIDS on Compute Canada clusters based from a Singularity container.
<!--T:3-->
The main components are:
* '''cuDF''', a Python GPU DataFrame library (built on the Apache Arrow columnar memory format) for loading, joining, aggregating, filtering, and otherwise manipulating data.


=Build a Singularity image for RAPIDS=
<!--T:38-->
* '''cuML''', a suite of libraries that implement machine learning algorithms and mathematical primitive functions that share compatible APIs with other RAPIDS projects.


To build a Singularity image for RAPIDS the first thing to do is to find a Docker image for RAPIDS.
<!--T:39-->
* '''cuGraph''', a GPU accelerated graph analytics library, with functionality like NetworkX, which is seamlessly integrated into the RAPIDS data science platform.
 
<!--T:40-->
* '''Cyber Log Accelerators (CLX or ''clicks'')''', a collection of RAPIDS examples for security analysts, data scientists, and engineers to quickly get started applying RAPIDS and GPU acceleration to real-world cybersecurity use cases.
 
<!--T:41-->
* '''cuxFilter''', a connector library, which provides the connections between different visualization libraries and a GPU dataframe without much hassle. This also allows you to use charts from different libraries in a single dashboard, while also providing the interaction.
 
<!--T:42-->
* '''cuSpatial''', a GPU accelerated C++/Python library for accelerating GIS workflows including point-in-polygon, spatial join, coordinate systems, shape primitives, distances, and trajectory analysis.
 
<!--T:43-->
* '''cuSignal''', which leverages CuPy, Numba, and the RAPIDS ecosystem for GPU accelerated signal processing. In some cases, cuSignal is a direct port of Scipy Signal to leverage GPU compute resources via CuPy but also contains Numba CUDA kernels for additional speedups for selected functions.
 
<!--T:44-->
* '''cuCIM''', an extensible toolkit designed to provide GPU accelerated I/O, computer vision & image processing primitives for N-Dimensional images with a focus on biomedical imaging.
 
<!--T:45-->
* '''RAPIDS Memory Manager (RMM)''', a central place for all device memory allocations in cuDF (C++ and Python) and other RAPIDS libraries. In addition, it is a replacement allocator for CUDA Device Memory (and CUDA Managed Memory) and a pool allocator to make CUDA device memory allocation / deallocation faster and asynchronous.
 
= Apptainer images= <!--T:4-->
 
<!--T:5-->
To build an Apptainer (formerly called [[Singularity#Please_use_Apptainer_instead|Singularity]]) image for RAPIDS, the first thing to do is to find and select a Docker image provided by NVIDIA.
 
==Finding a Docker image== <!--T:6-->
   
   
==Where to look for a Docker image for RAPIDS==
There are three types of RAPIDS Docker images: ''base'', ''runtime'', and ''devel''. For each type, multiple images are provided for different combinations of RAPIDS and CUDA versions, either on Ubuntu or on CentOS. You can find the Docker <tt>pull</tt> command for a selected image under the '''Tags''' tab on each site. 


<!--T:7-->
There are three types of RAPIDS Docker images, i.e. base, runtime, and devel types, and they are available at two major sites. For each type of Docker images, multiple images are provided with different combinations of RAPIDS versions and CUDA versions either in Ubuntu base or in CentOS base. You can find the Docker pull command of a selected image via the '''Tag''' tab on each given site: 
* [https://ngc.nvidia.com/catalog/containers/nvidia:rapidsai:rapidsai NVIDIA GPU Cloud (NGC)]
** '''base''' images contain a RAPIDS environment ready for use. Use this type of image if you want to submit a job to the Slurm scheduler.
** '''runtime''' images extend the base image by adding a Jupyter notebook server and example notebooks. Use this type of image if you want to interactively work with RAPIDS through notebooks and examples.   
* [https://hub.docker.com/r/rapidsai/rapidsai-dev Docker Hub]
** '''devel''' images contain the full RAPIDS source tree, the compiler toolchain, the debugging tools, the headers and the static libraries for RAPIDS development. Use this type of image if you want to implement customized operations with low-level access to cuda-based processes.


* [https://ngc.nvidia.com/catalog/containers/nvidia:rapidsai:rapidsai NVIDIA GPU Cloud (NGC)]: this site provides two types of RAPIDS images, i.e. base type and runtime type
<!--T:46-->
** base type - contains a RAPIDS environment ready to use. Use this type of image if you want to submit a job to the Slurm scheduler.
'''NOTE:''' Starting with the RAPIDS v23.08 release, '''base''' type images are available [https://catalog.ngc.nvidia.com/orgs/nvidia/teams/rapidsai/containers/base here], '''runtime''' type images are replaced by `notebooks` images and available [https://catalog.ngc.nvidia.com/orgs/nvidia/teams/rapidsai/containers/notebooks here], and '''devel''' type images are no longer supported.
** runtime type - extends the base image by adding a Jupyter notebook server and example notebooks. Use this type of image if you want to interactively work with RAPIDS through notebooks and examples.   
* [https://hub.docker.com/r/rapidsai/rapidsai-dev Docker Hub]: this site provides RAPIDS images in devel type.
** devel type - contains the full RAPIDS source tree, the compiler toolchain, the debugging tools, the headers and the static libraries for RAPIDS development. Use this type of image if you want to implement any customized operations with low-level access to cuda-based processes.


==Build a RAPIDS Singularity image==
==Building an Apptainer image== <!--T:8-->


For example, if a docker pull command for a selected image is given as:
<!--T:9-->
For example, if a Docker <tt>pull</tt> command for a selected image is given as  


<source lang="console"> docker pull nvcr.io/nvidia/rapidsai/rapidsai:cuda11.0-runtime-centos7</source>  
<!--T:10-->
<source lang="console">docker pull nvcr.io/nvidia/rapidsai/rapidsai:cuda11.0-runtime-centos7</source>  
   
   
On a computer that has Singularity supported, you can build a Singularity image, e.g. called ''rapids.sif'', with following command:  
on a computer that supports Apptainer, you can build an Apptainer image (here ''rapids.sif'') with the following command based on the <tt>pull</tt> tag:  
   
   
{{Command|singularity build rapids.sif docker://nvcr.io/nvidia/rapidsai/rapidsai:cuda11.0-runtime-centos7 }}
<source lang="console">[name@server ~]$ apptainer build rapids.sif docker://nvcr.io/nvidia/rapidsai/rapidsai:cuda11.0-runtime-centos7</source>


It usually takes half to one hour to complete the image building process. Since the image size is relatively large, you need to have enough memory and disk spaces on the server for building such an image.
<!--T:11-->
It usually takes from thirty to sixty minutes to complete the image-building process. Since the image size is relatively large, you need to have enough memory and disk space on the server to build such an image.


=Work on Clusters with a RAPIDS Singularity image=
=Working on clusters with an Apptainer image= <!--T:12-->
Once you have a Singularity image for RAPIDS located on Compute Canada clusters, you can work interactively by requesting an interactive session on a GPU node or submit a batch job to the Slurm queue if you have your RAPIDS code ready.
Once you have an Apptainer image for RAPIDS ready in your account, you can request an interactive session on a GPU node or submit a batch job to Slurm if you have your RAPIDS code ready.
==Explore the contents in RAPIDS==


If simply exploring the contents without doing any computations, you can use following commands to access the container shell of the Singularity image, e.g. called ''rapids.sif''  on any node without requesting any GPUs.
==Working interactively on a GPU node== <!--T:13-->


<!--T:5-->
<!--T:14-->
Load the Singularity module first:
If an Apptainer image was built based on a runtime or a devel type of Docker image, it includes a Jupyter Notebook server and can be used to explore RAPIDS interactively on a compute node with a GPU.<br>
<source lang="console">[name@server ~]$ module load singularity</source>
To request an interactive session on a compute node with a single GPU, e.g. a T4 type of GPU on Graham, run
<source lang="console">[name@gra-login ~]$ salloc --ntasks=1 --cpus-per-task=2 --mem=10G --gres=gpu:t4:1 --time=1:0:0 --account=def-someuser</source>


<!--T:6-->
<!--T:15-->
Then access the container shell:
Once the requested resource is granted, start the RAPIDS shell on the GPU node with
<source lang="console">[name@server ~]$ singularity shell rapids.sif</source>


The shell prompt is then changed to:
<!--T:16-->
<!--T:8-->
<source lang="console">[name@gra#### ~]$ module load apptainer
<source lang="console">Singularity>
[name@gra#### ~]$ apptainer shell --nv -B /home -B /project -B /scratch  rapids.sif
</source>
</source>
* the <tt>--nv</tt> option binds the GPU driver on the host to the container, so the GPU device can be accessed from inside the Apptainer container;
* the <tt>-B</tt> option binds any filesystem that you would like to access from inside the container.
<!--T:17-->
After the shell prompt changes to <tt>Apptainer></tt>, you can check the GPU stats in the container to make sure the GPU device is accessible with
<source lang="console">Apptainer> nvidia-smi</source>


Inside the singularity shell initiate Conda and activate RAPIDS environment:
<!--T:18-->
<!--T:9-->
Then to initiate Conda and activate the RAPIDS environment, run
<source lang="console">Singularity> source /opt/conda/etc/profile.d/conda.sh
<source lang="console">Apptainer> source /opt/conda/etc/profile.d/conda.sh
Singularity> conda activate rapids  
Apptainer> conda activate rapids
</source>
</source>


The shell prompt in the rapids env is then changed to:
 
<!--T:10-->
 
<source lang="console">(rapids) Singularity>  
<!--T:19-->
After the shell prompt changes to <tt>(rapids) Apptainer></tt>, you can launch the Jupyter Notebook server in the RAPIDS environment with the following command, and the URL of the Notebook server will be displayed after it starts successfully.
<source lang="console">(rapids) Apptainer> jupyter-lab --ip $(hostname -f) --no-browser
</source>
</source>


Then you can list available packages in the rapids env:
<!--T:47-->
<!--T:11-->
'''NOTE:''' Starting with the RAPIDS v23.08 release, after initiating Conda, there is no need to activate rapids as all packages are included in the base conda environment, e.g. you can launch the Jupyter Notebook server in the base environment with the following commands.
<source lang="console">(rapids) Singularity> conda list
</source>


To deactivate rapids env and exit from the container:
<!--T:48-->
<!--T:11-->
<source lang="console">Apptainer> source /opt/conda/etc/profile.d/conda.sh
<source lang="console">(rapids) Singularity> conda deactivate
Apptainer> jupyter-lab --ip $(hostname -f) --no-browser
Singularity> exit 
</source>
</source>


You are then back to the host shell.
<!--T:20-->
As there is no direct Internet connection on a compute node on Graham, you would need to set up an SSH tunnel with port forwarding between your local computer and the GPU node. See [[Advanced_Jupyter_configuration#Connecting_to_JupyterLab|detailed instructions for connecting to Jupyter Notebook]].


==Work interactively on a GPU node==
==Submitting a RAPIDS job to the Slurm scheduler== <!--T:21-->


==Submit a RAPIDS job to Slurm scheduler==
<!--T:22-->
You would need to prepare two script files to submit your RAPIDS code execution to the Slurm queue, i.e. one for job submission script and one for job execution script.
Once you have your RAPIDS code ready and want to submit a job execution request to the Slurm scheduler, you need to prepare two script files, i.e. a job submission script and a job execution script.


Here is an example of a job submission script, e.g. ''submit.sh'':
<!--T:23-->
'''Submission script'''
{{File
{{File
   |name=submit.sh
   |name=submit.sh
Line 90: Line 129:
   |contents=
   |contents=
#!/bin/bash
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --gres=gpu:t4:1
#SBATCH --gres=gpu:t4:1
#SBATCH --cpus-per-task=2
#SBATCH --cpus-per-task=2
Line 96: Line 134:
#SBATCH --time=dd-hh:mm
#SBATCH --time=dd-hh:mm
#SBATCH --account=def-someuser
#SBATCH --account=def-someuser
 
module load apptainer
module load singularity
apptainer exec --nv -B /home -B /scratch rapids.sif /path/to/run_script.sh
 
singularity run --nv -B /home -B /scratch rapids.sif /path/to/run_script.sh
}}
}}
Where '''--nv''' is to bind mount the GPU driver on the host to the container, so the GPU device can be accessed from inside the singularity container.


Here is an example of a job execution script, e.g. ''run_script.sh'', which you would like to run in the container to start the execution of the python code that has been programed with RAPIDS:
<!--T:24-->
'''Execution script'''
{{File
{{File
   |name=run_script.sh
   |name=run_script.sh
Line 110: Line 146:
#!/bin/bash
#!/bin/bash
source /opt/conda/etc/profile.d/conda.sh
source /opt/conda/etc/profile.d/conda.sh
conda activate rapids
conda activate rapids     # only needed if working with RAPIDS v.23.06 or under
nvidia-smi  
nvidia-smi  
python /path/to/my_rapids_code.py  
python /path/to/my_rapids_code.py  
}}
}}


=Helpful Links=
=Helpful links= <!--T:25-->


<references/>
<!--T:26-->
* [https://docs.rapids.ai/ RAPIDS Docs]: a collection of all the documentation for RAPIDS, how to stay connected and report issues;
* [https://github.com/rapidsai/notebooks RAPIDS Notebooks]: a collection of example notebooks on GitHub for getting started quickly;
* [https://medium.com/rapids-ai RAPIDS on Medium]: a collection of use cases and blogs for RAPIDS applications.


[[Category:Pages with video links]]
</translate>

Latest revision as of 20:12, 1 November 2023

Overview

RAPIDS is a suite of open source software libraries from NVIDIA mainly for executing data science and analytics pipelines in Python on GPUs. It relies on NVIDIA CUDA primitives for low-level compute optimization and provides friendly Python APIs, similar to those in Pandas or Scikit-learn.

The main components are:

  • cuDF, a Python GPU DataFrame library (built on the Apache Arrow columnar memory format) for loading, joining, aggregating, filtering, and otherwise manipulating data.
  • cuML, a suite of libraries that implement machine learning algorithms and mathematical primitive functions that share compatible APIs with other RAPIDS projects.
  • cuGraph, a GPU accelerated graph analytics library, with functionality like NetworkX, which is seamlessly integrated into the RAPIDS data science platform.
  • Cyber Log Accelerators (CLX or clicks), a collection of RAPIDS examples for security analysts, data scientists, and engineers to quickly get started applying RAPIDS and GPU acceleration to real-world cybersecurity use cases.
  • cuxFilter, a connector library that links different visualization libraries to a GPU DataFrame. It lets you combine charts from different libraries in a single dashboard while keeping them interactive.
  • cuSpatial, a GPU accelerated C++/Python library for accelerating GIS workflows including point-in-polygon, spatial join, coordinate systems, shape primitives, distances, and trajectory analysis.
  • cuSignal, which leverages CuPy, Numba, and the RAPIDS ecosystem for GPU-accelerated signal processing. In some cases, cuSignal is a direct port of SciPy Signal that uses GPU compute resources via CuPy; it also contains Numba CUDA kernels that give additional speedups for selected functions.
  • cuCIM, an extensible toolkit designed to provide GPU accelerated I/O, computer vision & image processing primitives for N-Dimensional images with a focus on biomedical imaging.
  • RAPIDS Memory Manager (RMM), a central place for all device memory allocations in cuDF (C++ and Python) and other RAPIDS libraries. In addition, it is a replacement allocator for CUDA Device Memory (and CUDA Managed Memory) and a pool allocator to make CUDA device memory allocation / deallocation faster and asynchronous.

Apptainer images

To build an Apptainer (formerly called Singularity) image for RAPIDS, first find and select a Docker image provided by NVIDIA.

Finding a Docker image

There are three types of RAPIDS Docker images: base, runtime, and devel. For each type, multiple images are provided for different combinations of RAPIDS and CUDA versions, either on Ubuntu or on CentOS. You can find the Docker pull command for a selected image under the Tags tab on each site.

  • NVIDIA GPU Cloud (NGC)
    • base images contain a RAPIDS environment ready for use. Use this type of image if you want to submit a job to the Slurm scheduler.
    • runtime images extend the base image by adding a Jupyter notebook server and example notebooks. Use this type of image if you want to interactively work with RAPIDS through notebooks and examples.
  • Docker Hub
    • devel images contain the full RAPIDS source tree, the compiler toolchain, the debugging tools, the headers and the static libraries for RAPIDS development. Use this type of image if you want to implement customized operations with low-level access to CUDA-based processes.

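NOTE: Starting with the RAPIDS v23.08 release, base images are available at https://catalog.ngc.nvidia.com/orgs/nvidia/teams/rapidsai/containers/base, runtime images are replaced by notebooks images, available at https://catalog.ngc.nvidia.com/orgs/nvidia/teams/rapidsai/containers/notebooks, and devel images are no longer supported.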

Building an Apptainer image

For example, if a Docker pull command for a selected image is given as

docker pull nvcr.io/nvidia/rapidsai/rapidsai:cuda11.0-runtime-centos7

on a computer that supports Apptainer, you can build an Apptainer image (here rapids.sif) with the following command based on the pull tag:

[name@server ~]$ apptainer build rapids.sif docker://nvcr.io/nvidia/rapidsai/rapidsai:cuda11.0-runtime-centos7

It usually takes from thirty to sixty minutes to complete the image-building process. Since the image size is relatively large, you need to have enough memory and disk space on the server to build such an image.
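As a minimal sketch of the step above, the following shell snippet derives the docker:// URI from a Docker pull line and prints the matching build command so it can be reviewed before running. The helper name docker_pull_to_uri is hypothetical and only serves this illustration.

```shell
#!/bin/bash
# Hypothetical helper: turn "docker pull <ref>" into the docker:// URI
# that "apptainer build" expects as its source.
docker_pull_to_uri() {
    local pull_cmd="$1"
    # Strip the leading "docker pull " and keep the image reference.
    local ref="${pull_cmd#docker pull }"
    printf 'docker://%s\n' "$ref"
}

uri=$(docker_pull_to_uri "docker pull nvcr.io/nvidia/rapidsai/rapidsai:cuda11.0-runtime-centos7")
# Print the build command instead of running it.
echo "apptainer build rapids.sif $uri"
```

If disk space in your home directory is tight, Apptainer's APPTAINER_CACHEDIR and APPTAINER_TMPDIR environment variables can be set to point the build cache and temporary files at a larger filesystem before building.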

Working on clusters with an Apptainer image

Once you have an Apptainer image for RAPIDS ready in your account, you can request an interactive session on a GPU node or submit a batch job to Slurm if you have your RAPIDS code ready.

Working interactively on a GPU node

If an Apptainer image was built based on a runtime or a devel type of Docker image, it includes a Jupyter Notebook server and can be used to explore RAPIDS interactively on a compute node with a GPU.
To request an interactive session on a compute node with a single GPU, e.g. a T4 type of GPU on Graham, run

[name@gra-login ~]$ salloc --ntasks=1 --cpus-per-task=2 --mem=10G --gres=gpu:t4:1 --time=1:0:0 --account=def-someuser

Once the requested resource is granted, start the RAPIDS shell on the GPU node with

[name@gra#### ~]$ module load apptainer
[name@gra#### ~]$ apptainer shell --nv -B /home -B /project -B /scratch rapids.sif
  • the --nv option binds the GPU driver on the host to the container, so the GPU device can be accessed from inside the Apptainer container;
  • the -B option binds any filesystem that you would like to access from inside the container.
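Not every cluster exposes the same filesystems, so one possible approach is to bind only the directories that exist on the current host. The sketch below assumes directory paths without spaces; the helper name build_bind_args is hypothetical, and the command is printed rather than executed.

```shell
#!/bin/bash
# Hypothetical helper: emit "-B <dir>" for each argument that is an
# existing directory (assumes paths contain no spaces).
build_bind_args() {
    local args=""
    local dir
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            args="$args -B $dir"
        fi
    done
    # Drop the leading space left by the first append.
    printf '%s\n' "${args# }"
}

bind_args=$(build_bind_args /home /project /scratch)
# Print the resulting command so it can be checked before use.
echo "apptainer shell --nv $bind_args rapids.sif"
```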

After the shell prompt changes to Apptainer>, you can check the GPU stats in the container to make sure the GPU device is accessible with

Apptainer> nvidia-smi

Then, to initialize Conda and activate the RAPIDS environment, run

Apptainer> source /opt/conda/etc/profile.d/conda.sh
Apptainer> conda activate rapids


After the shell prompt changes to (rapids) Apptainer>, you can launch the Jupyter Notebook server in the RAPIDS environment with the following command, and the URL of the Notebook server will be displayed after it starts successfully.

(rapids) Apptainer> jupyter-lab --ip $(hostname -f) --no-browser

NOTE: Starting with the RAPIDS v23.08 release, there is no need to activate the rapids environment after initializing Conda, because all packages are included in the base Conda environment. For example, you can launch the Jupyter Notebook server in the base environment with the following commands.

Apptainer> source /opt/conda/etc/profile.d/conda.sh
Apptainer> jupyter-lab --ip $(hostname -f) --no-browser

As there is no direct Internet connection on a compute node on Graham, you need to set up an SSH tunnel with port forwarding between your local computer and the GPU node. See detailed instructions for connecting to Jupyter Notebook.
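The general shape of such a tunnel can be sketched as follows. The node name, login host, and user name below are placeholders, and port 8888 is an assumption: use the host and port shown in the URL that Jupyter prints. The helper name tunnel_command is hypothetical; it only prints the command, which would be run on your local computer.

```shell
#!/bin/bash
# Hypothetical helper: print an ssh local port-forwarding command that
# maps localhost:<port> to <node>:<port> through the login host.
tunnel_command() {
    local node="$1"
    local port="$2"
    local login="$3"
    printf 'ssh -L %s:%s:%s %s\n' "$port" "$node" "$port" "$login"
}

# Placeholder values; replace them with the node and port from the Jupyter URL.
tunnel_command gpu-node-name 8888 someuser@cluster-login-host
```

With the tunnel open, the Notebook URL can be opened in a local browser after replacing the node name with localhost.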

Submitting a RAPIDS job to the Slurm scheduler

Once you have your RAPIDS code ready and want to submit a job execution request to the Slurm scheduler, you need to prepare two script files, i.e. a job submission script and a job execution script.

Submission script

File : submit.sh

#!/bin/bash
#SBATCH --gres=gpu:t4:1
#SBATCH --cpus-per-task=2
#SBATCH --mem=10G
#SBATCH --time=dd-hh:mm
#SBATCH --account=def-someuser
module load apptainer
apptainer exec --nv -B /home -B /scratch rapids.sif /path/to/run_script.sh
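Slurm reads a time limit with a day component in days-hours:minutes form; a value with two colons is read as hours:minutes:seconds. As a small illustration, the hypothetical helper below formats a limit in the day-based form for the --time line.

```shell
#!/bin/bash
# Hypothetical helper: format a time limit as days-hours:minutes,
# one of the forms Slurm accepts for --time.
# Arguments are plain integers without leading zeros.
slurm_time() {
    local days="$1"
    local hours="$2"
    local minutes="$3"
    printf '%d-%02d:%02d\n' "$days" "$hours" "$minutes"
}

# Three hours, for use as: #SBATCH --time=0-03:00
slurm_time 0 3 0
```

Once the script is complete, it is submitted from a login node with sbatch submit.sh.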


Execution script

File : run_script.sh

#!/bin/bash
source /opt/conda/etc/profile.d/conda.sh
conda activate rapids     # only needed with RAPIDS v23.06 or earlier
nvidia-smi 
python /path/to/my_rapids_code.py
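If the same execution script has to work with images from before and after v23.08, one option is to activate the rapids environment only when it exists. The sketch below is an assumption about how that check could look: the helper name has_rapids_env is hypothetical, and the sample listing stands in for the output of conda env list inside the container.

```shell
#!/bin/bash
# Hypothetical helper: succeed if a "conda env list"-style listing
# contains an environment named "rapids".
has_rapids_env() {
    printf '%s\n' "$1" | grep -Eq '^rapids[[:space:]]'
}

# Sample listing for illustration; inside the container one would use:
#   env_list=$(conda env list)
env_list='# conda environments:
#
base                  *  /opt/conda
rapids                   /opt/conda/envs/rapids'

if has_rapids_env "$env_list"; then
    echo "conda activate rapids"
else
    echo "rapids environment not found; using base"
fi
```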


Helpful links

  • RAPIDS Docs: a collection of all the documentation for RAPIDS, how to stay connected and report issues;
  • RAPIDS Notebooks: a collection of example notebooks on GitHub for getting started quickly;
  • RAPIDS on Medium: a collection of use cases and blogs for RAPIDS applications.