RAPIDS: Difference between revisions

no edit summary
Line 36: Line 36:
<source lang="console">docker pull nvcr.io/nvidia/rapidsai/rapidsai:cuda11.0-runtime-centos7</source>  
<source lang="console">docker pull nvcr.io/nvidia/rapidsai/rapidsai:cuda11.0-runtime-centos7</source>  
   
   
on a computer that has supports Singularity, you can build a Singularity image (here ''rapids.sif'') with the following command based on the given pull tag:  
on a computer that supports Singularity, you can build a Singularity image (here ''rapids.sif'') with the following command based on the pull tag:  
   
   
<source lang="console">[name@server ~]$ singularity build rapids.sif docker://nvcr.io/nvidia/rapidsai/rapidsai:cuda11.0-runtime-centos7</source>
<source lang="console">[name@server ~]$ singularity build rapids.sif docker://nvcr.io/nvidia/rapidsai/rapidsai:cuda11.0-runtime-centos7</source>


<!--T:11-->
<!--T:11-->
It usually takes from thirty to sixty minutes to complete the image building process. Since the image size is relatively large, you need to have enough memory and disk space on the server to build such an image.
It usually takes from thirty to sixty minutes to complete the image-building process. Since the image size is relatively large, you need to have enough memory and disk space on the server to build such an image.


=Working on clusters with a Singularity image= <!--T:12-->
=Working on clusters with a Singularity image= <!--T:12-->
Line 48: Line 48:


<!--T:13-->
<!--T:13-->
To explore the contents without doing any computations, you can use following commands to access the container shell of the Singularity image (here''rapids.sif'') on any node without requesting a GPU.
To explore the contents without doing any computations, you can use the following commands to access the container shell of the Singularity image (here ''rapids.sif'') on any node without requesting a GPU.


   
   
Line 74: Line 74:


<!--T:18-->
<!--T:18-->
The shell prompt in the rapids env is then changed to
The shell prompt in the RAPIDS environment is then changed to
   
   
<source lang="console">(rapids) Singularity>  
<source lang="console">(rapids) Singularity>  
Line 80: Line 80:


<!--T:19-->
<!--T:19-->
Then you can list available packages in the rapids env with
Then you can list available packages in the RAPIDS environment with
   
   
<source lang="console">(rapids) Singularity> conda list  
<source lang="console">(rapids) Singularity> conda list  
Line 86: Line 86:


<!--T:20-->
<!--T:20-->
To deactivate rapids env and exit from the container, run
To deactivate the RAPIDS environment and exit from the container, run
   
   
<source lang="console">(rapids) Singularity> conda deactivate
<source lang="console">(rapids) Singularity> conda deactivate
Line 105: Line 105:


<!--T:25-->
<!--T:25-->
Once the requested resource is granted, start RAPIDS shell on the GPU node with
Once the requested resource is granted, start the RAPIDS shell on the GPU node with


<!--T:26-->
<!--T:26-->
Line 111: Line 111:
[name@gra#### ~]$ singularity shell --nv -B /home -B /project -B /scratch  rapids.sif
[name@gra#### ~]$ singularity shell --nv -B /home -B /project -B /scratch  rapids.sif
</source>
</source>
* the <tt>--nv</tt> option is to bind the GPU driver on the host to the container, so the GPU device can be accessed from inside the Singularity container;
* the <tt>--nv</tt> option binds the GPU driver on the host to the container, so the GPU device can be accessed from inside the Singularity container;
* the <tt>-B</tt> option is to bind any filesystem that you would like to access from inside the container.
* the <tt>-B</tt> option binds any filesystem that you would like to access from inside the container.
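
As a rough illustration of how the <tt>--nv</tt> and <tt>-B</tt> options combine, the following Python sketch assembles the same command line from a list of filesystems. The helper name <tt>singularity_shell_command</tt> and its parameters are hypothetical and are not part of Singularity or RAPIDS; only the flags shown in the command above are used.

```python
import shlex


def singularity_shell_command(image, bind_paths, use_gpu=True):
    """Return the argument list for `singularity shell` (hypothetical helper)."""
    args = ["singularity", "shell"]
    if use_gpu:
        # --nv binds the host GPU driver into the container
        args.append("--nv")
    for path in bind_paths:
        # one -B flag per filesystem that should be visible in the container
        args += ["-B", path]
    args.append(image)
    return args


cmd = singularity_shell_command("rapids.sif", ["/home", "/project", "/scratch"])
print(shlex.join(cmd))
```

Keeping the arguments as a list means they could also be passed to <tt>subprocess.run</tt> without further quoting, if you prefer to start the shell from a Python wrapper.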


<!--T:27-->
<!--T:27-->
Line 119: Line 119:


<!--T:28-->
<!--T:28-->
Then to initiate Conda and activate the rapids env, run
Then, to initialize Conda and activate the RAPIDS environment, run
<source lang="console">Singularity> source /opt/conda/etc/profile.d/conda.sh
<source lang="console">Singularity> source /opt/conda/etc/profile.d/conda.sh
Singularity> conda activate rapids   
Singularity> conda activate rapids   
Line 125: Line 125:


<!--T:29-->
<!--T:29-->
After the shell prompt changes to <tt>(rapids) Singularity></tt>, you can launch the Jupyter Notebook server in the rapids env with following command, and the URL of the Notebook server is displayed after it starts successfully:
After the shell prompt changes to <tt>(rapids) Singularity></tt>, you can launch the Jupyter Notebook server in the RAPIDS environment with the following command, and the URL of the Notebook server will be displayed after it starts successfully.
<source lang="console">(rapids) Singularity> jupyter-lab --ip $(hostname -f) --no-browser
<source lang="console">(rapids) Singularity> jupyter-lab --ip $(hostname -f) --no-browser
[I 22:28:20.215 LabApp] JupyterLab extension loaded from /opt/conda/envs/rapids/lib/python3.7/site-packages/jupyterlab
[I 22:28:20.215 LabApp] JupyterLab extension loaded from /opt/conda/envs/rapids/lib/python3.7/site-packages/jupyterlab
Line 137: Line 137:


     <!--T:30-->
     <!--T:30-->
To access the notebook, open this file in a browser:
To access the notebook, open this file in a browser
         file:///home/jhqin/.local/share/jupyter/runtime/nbserver-76967-open.html
         file:///home/jhqin/.local/share/jupyter/runtime/nbserver-76967-open.html
     Or copy and paste one of these URLs:
     Or copy and paste one of these URLs
         http://gra1160.graham.sharcnet:8888/?token=5d4b75bf2ec3481fab1b625656a322afc96775440b7bb8c4
         http://gra1160.graham.sharcnet:8888/?token=5d4b75bf2ec3481fab1b625656a322afc96775440b7bb8c4
     or http://127.0.0.1:8888/?token=5d4b75bf2ec3481fab1b625656a322afc96775440b7bb8c4
     or http://127.0.0.1:8888/?token=5d4b75bf2ec3481fab1b625656a322afc96775440b7bb8c4
</source>
</source>
Where the URL for the notebook server in above example is:
The URL for the notebook server in the above example is
<source lang="console">http://gra1160.graham.sharcnet:8888/?token=5d4b75bf2ec3481fab1b625656a322afc96775440b7bb8c4</source>
<source lang="console">http://gra1160.graham.sharcnet:8888/?token=5d4b75bf2ec3481fab1b625656a322afc96775440b7bb8c4</source>
As there is no direct Internet connection on a compute node on Graham, you would need to set up an SSH tunnel with port forwarding between your local computer and the GPU node. See [[Jupyter#Connecting_to_Jupyter_Notebook|detailed instructions for connecting to Jupyter Notebook]].
As there is no direct Internet connection on a compute node on Graham, you would need to set up an SSH tunnel with port forwarding between your local computer and the GPU node. See [[Jupyter#Connecting_to_Jupyter_Notebook|detailed instructions for connecting to Jupyter Notebook]].
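
The port-forwarding step can be sketched in Python as follows: the host and port are read from the notebook URL and turned into an <tt>ssh -L</tt> command, together with the matching URL to open locally. This is only an illustration under stated assumptions: the helper names are hypothetical, and the user name <tt>name</tt> and login host <tt>login.example.org</tt> are placeholders that you would replace with your own account and the cluster's login node.

```python
from urllib.parse import urlsplit, urlunsplit


def tunnel_command(notebook_url, user, login_host, local_port=8888):
    """Build an `ssh -L` command forwarding local_port to the notebook server."""
    parts = urlsplit(notebook_url)
    if parts.hostname is None or parts.port is None:
        raise ValueError("the URL must contain an explicit host and port")
    return f"ssh -L {local_port}:{parts.hostname}:{parts.port} {user}@{login_host}"


def local_url(notebook_url, local_port=8888):
    """Rewrite the notebook URL so that it points at the local end of the tunnel."""
    parts = urlsplit(notebook_url)
    return urlunsplit(
        (parts.scheme, f"localhost:{local_port}", parts.path, parts.query, parts.fragment)
    )


# URL copied from the example output above
url = "http://gra1160.graham.sharcnet:8888/?token=5d4b75bf2ec3481fab1b625656a322afc96775440b7bb8c4"

# "name" and "login.example.org" are placeholders, not real values
print(tunnel_command(url, "name", "login.example.org"))
print(local_url(url))
```

The first printed line is the command to run on your local computer; the second is the address to paste into your browser once the tunnel is open.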


==Submitting a RAPIDS job to Slurm scheduler== <!--T:31-->
==Submitting a RAPIDS job to the Slurm scheduler== <!--T:31-->
Once you have your RAPIDS code ready and would like to submit a job execution request to the Slurm scheduler, you need to prepare two script files, i.e. a job submission script and a job execution script.
Once you have your RAPIDS code ready and want to submit a job execution request to the Slurm scheduler, you need to prepare two script files, i.e. a job submission script and a job execution script.


<!--T:32-->
<!--T:32-->
Here is an example of a job submission script, e.g. ''submit.sh'':
Here is an example of a job submission script (here ''submit.sh''):
{{File
{{File
   |name=submit.sh
   |name=submit.sh
Line 171: Line 171:
}}
}}
   
   
Here is an example of job execution script ''run_script.sh'' which you would like to run in the container to start the execution of the Python code programmed with RAPIDS:  
Here is an example of a job execution script (here ''run_script.sh''), which you run in the container to start the execution of Python code programmed with RAPIDS:
{{File
{{File
   |name=run_script.sh
   |name=run_script.sh