RAPIDS

* '''RAPIDS Memory Manager (RMM)''', a central place for all device memory allocations in cuDF (C++ and Python) and other RAPIDS libraries. In addition, it is a replacement allocator for CUDA Device Memory (and CUDA Managed Memory) and a pool allocator to make CUDA device memory allocation / deallocation faster and asynchronous.
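
As a quick illustration, here is a minimal Python sketch of enabling the RMM pool allocator for cuDF; it assumes a working RAPIDS environment with a visible GPU, and the pool size shown is an arbitrary example.
<source lang="python">
import rmm
import cudf

# Route device allocations through an RMM memory pool: pre-reserve 1 GiB of
# GPU memory so later allocations reuse the pool rather than calling
# cudaMalloc/cudaFree each time.
rmm.reinitialize(pool_allocator=True, initial_pool_size=2**30)

df = cudf.DataFrame({"x": list(range(1000))})
print(df["x"].sum())  # device memory for this computation comes from the pool
</source>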


= Apptainer images= <!--T:4-->

<!--T:5-->
To build an Apptainer (formerly called [https://docs.alliancecan.ca/wiki/Singularity/en#Please_use_Apptainer_instead Singularity]) image for RAPIDS, the first thing to do is to find and select a Docker image provided by NVIDIA.


==Finding a Docker image== <!--T:6-->
NVIDIA provides RAPIDS Docker images in several types, e.g. ''runtime'' and ''devel''. In particular:
* '''devel''' images contain the full RAPIDS source tree, the compiler toolchain, the debugging tools, the headers and the static libraries for RAPIDS development. Use this type of image if you want to implement customized operations with low-level access to CUDA-based processes.


==Building an Apptainer image== <!--T:8-->

<!--T:9-->
Assuming the <tt>pull</tt> command of your selected Docker image is
<source lang="console">docker pull nvcr.io/nvidia/rapidsai/rapidsai:cuda11.0-runtime-centos7</source>
   
   
on a computer that supports Apptainer, you can build an Apptainer image (here ''rapids.sif'') with the following command based on the <tt>pull</tt> tag:

<source lang="console">[name@server ~]$ apptainer build rapids.sif docker://nvcr.io/nvidia/rapidsai/rapidsai:cuda11.0-runtime-centos7</source>


<!--T:11-->
It usually takes from thirty to sixty minutes to complete the image-building process. Since the image size is relatively large, you need to have enough memory and disk space on the server to build such an image.


=Working on clusters with an Apptainer image= <!--T:12-->
Once you have an Apptainer image for RAPIDS ready in your account, you can request an interactive session on a GPU node or submit a batch job to Slurm if you have your RAPIDS code ready.

==Working interactively on a GPU node== <!--T:13-->


<!--T:14-->
If an Apptainer image was built based on a runtime or a devel type of Docker image, it includes a Jupyter Notebook server and can be used to explore RAPIDS interactively on a compute node with a GPU.<br>
To request an interactive session on a compute node with a single GPU, e.g. a T4 type of GPU on Graham, run
<source lang="console">[name@gra-login ~]$ salloc --ntasks=1 --cpus-per-task=2 --mem=10G --gres=gpu:t4:1 --time=1:0:0 --account=def-someuser</source>
<!--T:15-->
Once the requested resource is granted, start the RAPIDS shell on the GPU node with


<!--T:16-->
<source lang="console">[name@gra#### ~]$ module load apptainer
[name@gra#### ~]$ apptainer shell --nv -B /home -B /project -B /scratch rapids.sif
</source>
* the <tt>--nv</tt> option binds the GPU driver on the host to the container, so the GPU device can be accessed from inside the Apptainer container;
* the <tt>-B</tt> option binds any filesystem that you would like to access from inside the container.


<!--T:17-->
After the shell prompt changes to <tt>Apptainer></tt>, you can check the GPU stats in the container to make sure the GPU device is accessible with
<source lang="console">Apptainer> nvidia-smi</source>


<!--T:18-->
Then to initiate Conda and activate the RAPIDS environment, run
<source lang="console">Apptainer> source /opt/conda/etc/profile.d/conda.sh
Apptainer> conda activate rapids
</source>


<!--T:19-->
After the shell prompt changes to <tt>(rapids) Apptainer></tt>, you can launch the Jupyter Notebook server in the RAPIDS environment with the following command, and the URL of the Notebook server will be displayed after it starts successfully.
<source lang="console">(rapids) Apptainer> jupyter-lab --ip $(hostname -f) --no-browser
</source>
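
Once connected to the notebook server, you can quickly verify that the environment works by running a short cuDF snippet in a notebook, for example (a minimal sketch with arbitrary data):
<source lang="python">
import cudf

# Build a small DataFrame in GPU memory and run reductions on the device.
gdf = cudf.DataFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]})
print(gdf["a"].sum(), gdf["b"].mean())
</source>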


==Submitting a RAPIDS job== <!--T:20-->
If you have your RAPIDS code ready, you can submit it as a batch job to the Slurm scheduler. A submission script would look like the following; the resource requests mirror the interactive example above, so adjust them to your needs:
{{File
  |name=submit.sh
  |lang="bash"
  |contents=
#!/bin/bash
#SBATCH --gres=gpu:t4:1
#SBATCH --cpus-per-task=2
#SBATCH --mem=10G
#SBATCH --time=dd:hh:mm
#SBATCH --account=def-someuser
module load apptainer
apptainer exec --nv -B /home -B /scratch rapids.sif /path/to/run_script.sh
}}
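
Here <tt>/path/to/run_script.sh</tt> is your own execution script to be run inside the container. A minimal sketch of such a script is shown below; the file name and <tt>my_rapids_code.py</tt> are placeholders for your own code:
{{File
  |name=run_script.sh
  |lang="bash"
  |contents=
#!/bin/bash
# Initiate Conda and activate the RAPIDS environment inside the container,
# as in the interactive session above, then run your RAPIDS Python code.
source /opt/conda/etc/profile.d/conda.sh
conda activate rapids
python /path/to/my_rapids_code.py
}}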

