JupyterNotebook



This article is a draft

This is not a complete article: This is a draft, a work in progress that is intended to be published into an article, which may or may not be ready for inclusion in the main wiki. It should not necessarily be considered factual or authoritative.




Introduction

Jupyter Notebook is available through a Python module on Graham. You can run it on a login node (not recommended) or on a compute node (highly recommended). Login nodes impose various user- and process-based limits, so notebooks running there may be killed if they consume significant CPU time or memory. To run on a compute node, you must submit a job requesting the number of CPUs (and GPUs, if needed), the amount of memory, and the run time. This page gives the instructions for submitting a Jupyter job.

Installing Jupyter Notebook

These instructions install Jupyter into your home directory. To install Jupyter, we use the pip command and install it into a Python virtual environment. The instructions below install it for Python 3.5.2, but you can also install it for Python 3.5.Y or 2.7.X by loading a different Python module.

Load the Python module:

[name@server ~]$ module load python/3.5.2

Create a new Python virtual environment:

[name@server ~]$ virtualenv jupyter_py3

Activate your newly created Python virtual environment:

[name@server ~]$ source jupyter_py3/bin/activate

Install Jupyter into your newly created virtual environment:

(jupyter_py3) [name@server ~]$ pip install jupyter
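After the install step, it can be useful to confirm that the `jupyter` command on your PATH really comes from the virtual environment and not from a system-wide installation. The following is a minimal sketch; the function name `jupyter_in_venv` is hypothetical, and it assumes the activation script sets the `VIRTUAL_ENV` variable, as virtualenv normally does.

```shell
# Succeed only if a virtual environment is active and the jupyter
# executable found on PATH lives inside that environment's bin directory.
# jupyter_in_venv is an illustrative helper name, not part of Jupyter.
jupyter_in_venv() {
    # VIRTUAL_ENV is set by the virtualenv activate script
    [ -n "${VIRTUAL_ENV:-}" ] || return 1
    case "$(command -v jupyter)" in
        "$VIRTUAL_ENV"/bin/*) return 0 ;;
        *) return 1 ;;
    esac
}
```

With the environment activated, `jupyter_in_venv && echo "Jupyter is installed in $VIRTUAL_ENV"` should print the path of jupyter_py3.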

Connecting to a manually spawned Jupyter Notebook

Create a Tunnel

To access the notebook running on a compute node from your web browser, you will need to create a tunnel between the cluster and your computer since the compute nodes are not directly accessible from the Internet. To create that tunnel, we recommend the usage of the Python package sshuttle.

On your computer, open a new terminal window and run the following sshuttle command to create the tunnel:

[name@my_computer ~]$ sshuttle --dns -Nr userid@machine_name

Load the module

Log in to Graham (graham.sharcnet.ca or graham.computecanada.ca) or Cedar (cedar.computecanada.ca) and load the Python module:

[name@my_computer ~]$ ssh user@{graham|cedar}.computecanada.ca
[name@server ~]$ module load python35-scipy-stack

Install Python modules

Refer to the Python documentation.

Submit the job

Create a bash script for submitting a Jupyter job to the Slurm scheduler, e.g. slurm_jupyter.sh, and add the following content. Do not forget to change the lines containing <>: you need to set your email (line 10) and your account (line 11).

{{File
 |name=slurm_jupyter.sh
 |lang="sh"
 |lines=yes
 |contents=
#!/bin/bash
#SBATCH --gres=gpu:1 # number of GPUs, if needed
#SBATCH --time=0-01:00 # time in d-hh:mm
#SBATCH --nodes 1 # nodes
#SBATCH --ntasks-per-node 2 # cores per node
#SBATCH --mem-per-cpu 4000 # memory in MB
#SBATCH --job-name tunnel # name of the job
#SBATCH --output jupyter-log-%J.txt # output file
#SBATCH --mail-type=BEGIN # send email when job begins
#SBATCH --mail-user=<email_to_notify> # to this email address
#SBATCH --account=<account> # account

# load cuda (remove if you don't need GPUs) and python modules
module load cuda
module load python35-scipy-stack

# get tunneling info
XDG_RUNTIME_DIR=""

# choose a random port
ipnport=$(shuf -i8000-9999 -n1)

# get hostname's IP and remove whitespace
ipnip=$(hostname -i | xargs)

# print tunneling instructions to jupyter-log-{jobid}.txt
echo -e "

   Copy/Paste this in your local terminal to ssh tunnel with remote
   -----------------------------------------------------------------
   sshuttle -r $USER@${CC_CLUSTER}.computecanada.ca -v $ipnip/24
   -----------------------------------------------------------------
   Then open a browser on your local machine to the following address
   ------------------------------------------------------------------
   http://$ipnip:$ipnport  (prefix w/ https:// if using password)
   ------------------------------------------------------------------
   "

# launch the Jupyter notebook server
jupyter-notebook --no-browser --port=$ipnport --ip=$ipnip
}}

Submit the script slurm_jupyter.sh:

[name@server ~]$ sbatch slurm_jupyter.sh
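The log file with the tunneling instructions only appears once the job has actually started, which may take a while if the job sits in the queue. The following is a minimal sketch of a polling helper; the function name `wait_for_log` and the timeout values are assumptions chosen for illustration.

```shell
# Poll until the given file exists and is non-empty, or give up after a
# timeout in seconds (default 300). Returns 0 on success, 1 on timeout.
# wait_for_log is an illustrative helper name, not a Slurm command.
wait_for_log() {
    log_file=$1
    timeout=${2:-300}
    waited=0
    while [ ! -s "$log_file" ]; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Example use on the cluster (left commented out because it needs Slurm):
# jobid=$(sbatch --parsable slurm_jupyter.sh)  # prints "jobid" or "jobid;cluster"
# jobid=${jobid%%;*}                           # keep only the job ID
# wait_for_log "jupyter-log-${jobid}.txt" 600 && cat "jupyter-log-${jobid}.txt"
```

If the job waits in the queue longer than the chosen timeout, the helper returns a non-zero status and you can simply run it again.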

Example on Graham

Once you have submitted the job and it has started, check the instructions in the log file jupyter-log-*.txt. For example, if the job ID is 131620, the file jupyter-log-131620.txt will contain something like the following:

    Copy/Paste this in your local terminal to ssh tunnel with remote
    -----------------------------------------------------------------
    sshuttle -r jnandez@graham.computecanada.ca -v 10.29.76.27/24
    -----------------------------------------------------------------

    Then open a browser on your local machine to the following address
    ------------------------------------------------------------------
    http://10.29.76.27:9799  (prefix w/ https:// if using password)
    ------------------------------------------------------------------
    
[I 13:58:14.393 NotebookApp] Serving notebooks from local directory: /home/jnandez/jupyter_notebook_HPC
[I 13:58:14.394 NotebookApp] 0 active kernels 
[I 13:58:14.394 NotebookApp] The Jupyter Notebook is running at: http://10.29.76.27:9799/
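Rather than copying the two lines by hand, they can be pulled out of the log file with standard text tools. The following is a minimal sketch that assumes the log has the layout shown above; the function names `notebook_url` and `tunnel_command` are hypothetical.

```shell
# Print the first http://IP:PORT address found in the log file.
notebook_url() {
    grep -o 'http://[0-9.]*:[0-9]*' "$1" | head -n 1
}

# Print the first sshuttle command found in the log file,
# with its leading indentation removed.
tunnel_command() {
    sed -n 's/^[[:space:]]*\(sshuttle .*\)$/\1/p' "$1" | head -n 1
}
```

For the log above, `notebook_url jupyter-log-131620.txt` would print the address to open in your browser, and `tunnel_command jupyter-log-131620.txt` would print the sshuttle command to run on your own computer.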

Opening in a web browser

Open a browser on your local machine and go to:

http://10.29.76.27:9799


Example on Cedar

Once you have submitted the job and it has started, check the instructions in the log file jupyter-log-*.txt. For example, if the job ID is 1418055, the file jupyter-log-1418055.txt will contain something like the following:


    Copy/Paste this in your local terminal to ssh tunnel with remote
    -----------------------------------------------------------------
    sshuttle -r jnandez@cedar.computecanada.ca -v 172.16.136.103/24
    -----------------------------------------------------------------

    Then open a browser on your local machine to the following address
    ------------------------------------------------------------------
    http://172.16.136.103:8975  (prefix w/ https:// if using password)
    ------------------------------------------------------------------
    
[I 09:31:44.233 NotebookApp] Serving notebooks from local directory: /home/jnandez/jupyter_notebook_HPC
[I 09:31:44.233 NotebookApp] 0 active kernels 
[I 09:31:44.233 NotebookApp] The Jupyter Notebook is running at: http://172.16.136.103:8975/
[I 09:31:44.233 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).

Opening in a web browser

Open a browser on your local machine and go to:

http://172.16.136.103:8975
