JupyterNotebook

From Alliance Doc
Revision as of 19:35, 23 October 2017 by Diane27 (talk | contribs)

Introduction

Project Jupyter is an open source project, born out of the IPython Project, as it evolved to support interactive data science and scientific computing across all programming languages. The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text [1].

You can run Jupyter Notebook on a login node (not recommended) or on a compute node (highly recommended). Note that login nodes impose various user- and process-based limits, so notebooks running there may be killed if they consume significant CPU time or memory. To use a compute node, you must submit a job requesting the number of CPUs (and, optionally, GPUs), the amount of memory, and the run time. The instructions below explain how to submit a Jupyter Notebook job.

Some regional partners provide a web portal named JupyterHub so that users do not have to create their own Jupyter Notebook setup. To learn more, visit the JupyterHub wiki page.

Install Jupyter Notebook

These instructions install Jupyter Notebook with the pip command in a Python virtual environment in your home directory. The following instructions are for Python 3.5.2, but you can also install the application for a different version by loading a different Python module.

  1. Load the Python module:
    [name@server ~]$ module load python/3.5.2
    
  2. Create a new Python virtual environment:
    [name@server ~]$ virtualenv $HOME/jupyter_py3
    
  3. Activate your newly created Python virtual environment:
    [name@server ~]$ source $HOME/jupyter_py3/bin/activate
    
  4. Install Jupyter into your virtual environment:
    (jupyter_py3)[name@server $] pip install jupyter
    
  5. In your virtual environment, create a wrapper script that launches Jupyter Notebook:
    (jupyter_py3)[name@server $] echo -e '#!/bin/bash\nunset XDG_RUNTIME_DIR\njupyter notebook --ip $(hostname -f) --no-browser' > $VIRTUAL_ENV/bin/notebook.sh
    
  6. Finally, make the script executable:
    (jupyter_py3)[name@server $] chmod u+x $VIRTUAL_ENV/bin/notebook.sh
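
Steps 5 and 6 above can also be written with a here-document, which makes the contents of the wrapper script easier to read. This is only a sketch of an equivalent form: the variable name venv is an illustrative choice, and the fallback to $HOME/jupyter_py3 when no virtual environment is active is an assumption based on the path used in step 2.

```shell
# Use the active virtual environment, or fall back to the path from step 2
# (the fallback is an assumption made for this sketch).
venv="${VIRTUAL_ENV:-$HOME/jupyter_py3}"
mkdir -p "$venv/bin"

# The quoted 'EOF' keeps $(hostname -f) literal, so it is evaluated on the
# compute node when the script runs, not now on the login node.
cat > "$venv/bin/notebook.sh" <<'EOF'
#!/bin/bash
unset XDG_RUNTIME_DIR
jupyter notebook --ip $(hostname -f) --no-browser
EOF

# Make the wrapper executable for the current user.
chmod u+x "$venv/bin/notebook.sh"
```

The resulting file has the same three lines as the one produced by the echo -e command in step 5.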
    

Install Extensions

Extensions allow you to add functionality to the Notebook application and modify its appearance.

Jupyter Lmod

Jupyter Lmod is an extension that allows you to interact with environment modules before launching kernels. The extension uses Lmod's Python interface to accomplish module-related tasks like loading, unloading, saving a collection, etc.

(jupyter_py3)[name@server $] pip install jupyterlmod
(jupyter_py3)[name@server $] jupyter nbextension install --py jupyterlmod --sys-prefix
(jupyter_py3)[name@server $] jupyter nbextension enable --py jupyterlmod --sys-prefix
(jupyter_py3)[name@server $] jupyter serverextension enable --py jupyterlmod --sys-prefix


RStudio Launcher

Jupyter can start an RStudio session that uses Jupyter's token authentication system. This extension adds an "RStudio Session" button to the New notebook menu.

(jupyter_py3)[name@server $] pip install nbserverproxy
(jupyter_py3)[name@server $] pip install git+https://github.com/cmd-ntrf/nbrsessionproxy
(jupyter_py3)[name@server $] jupyter serverextension enable --py nbserverproxy --sys-prefix
(jupyter_py3)[name@server $] jupyter nbextension install --py nbrsessionproxy --sys-prefix
(jupyter_py3)[name@server $] jupyter nbextension enable --py nbrsessionproxy --sys-prefix
(jupyter_py3)[name@server $] jupyter serverextension enable --py nbrsessionproxy --sys-prefix


Activate the environment

Once you have installed Jupyter, each time you log in to the cluster you only need to reload the Python module associated with your environment:

[name@server ~]$ module load python/3.5.2

Then, activate the virtual environment in which you have installed Jupyter:

[name@server ~]$ source $HOME/jupyter_py3/bin/activate
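
If you repeat these two steps often, one option is to wrap them in a small shell function, for example in your ~/.bashrc. This is a minimal sketch: the function name load_jupyter_env is hypothetical, and the module version and path are the ones used in the installation steps, so adjust them if yours differ.

```shell
# Hypothetical helper: reloads the Python module and activates the
# virtual environment created during installation.
load_jupyter_env() {
    # 'module' is provided by the cluster's module system
    module load python/3.5.2
    # Activate the environment; this changes the current shell session
    source "$HOME/jupyter_py3/bin/activate"
}
```

After defining the function, running load_jupyter_env in a login session on the cluster performs both steps at once.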

RStudio Server (optional)

If you have installed the RStudio launcher extension and wish to use it, you will have to load the RStudio Server module.

(jupyter_py3)[name@server $] module load rstudio-server

Start the Notebook

To start the Notebook, submit an interactive job. Adjust the parameters based on your needs. See Running jobs for more information.

[name@server ~]$ salloc --time=1:0:0 --ntasks=1 --cpus-per-task=2 --mem-per-cpu=1024M --account=def-yourpi srun notebook.sh
salloc: Granted job allocation 1422754
[I 14:07:08.661 NotebookApp] Serving notebooks from local directory: /home/fafor10
[I 14:07:08.662 NotebookApp] 0 active kernels
[I 14:07:08.662 NotebookApp] The Jupyter Notebook is running at:
[I 14:07:08.663 NotebookApp] http://cdr544.int.cedar.computecanada.ca:8888/?token=7ed7059fad64446f837567e32af8d20efa72e72476eb72ca
[I 14:07:08.663 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 14:07:08.669 NotebookApp]

Copy/paste this URL into your browser when you connect for the first time,
    to login with a token:
        http://cdr544.int.cedar.computecanada.ca:8888/?token=7ed7059fad64446f837567e32af8d20efa72e72476eb72ca

Connect to the Notebook

To access the notebook running on a compute node from your web browser, you will need to create an SSH tunnel between the cluster and your computer since the compute nodes are not directly accessible from the Internet.

From Linux or MacOS X

On a Linux or MacOS X system we recommend using the Python package sshuttle.

On your computer, open a new terminal window and run the following sshuttle command to create the tunnel:

[name@my_computer $] sshuttle --dns -Nr userid@machine_name

Then copy and paste the provided URL into your browser. In the example above, this would be:

 http://cdr544.int.cedar.computecanada.ca:8888/?token=7ed7059fad64446f837567e32af8d20efa72e72476eb72ca

From Windows

An SSH tunnel can be created from Windows using MobaXterm as follows. Open two sessions in MobaXterm.

Session 1 should be a connection to a cluster. Follow the instructions in Start the Notebook above to create a Jupyter notebook.

Session 2 should be a local terminal. In it we will set up the SSH tunnel. Run the following command, substituting the node name from the URL you received in Session 1. Following the example shown under Start the Notebook above:

[name@my_computer ]$  ssh -L 8888:cdr544.int.cedar.computecanada.ca:8888 someuser@cedar.computecanada.ca

In this command, -L requests local port forwarding: local port 8888 on your computer is forwarded to port 8888 on cdr544.int.cedar.computecanada.ca, the hostname that was given when we started Jupyter Notebook. Now open your browser and go to:

 http://localhost:8888/?token=7ed7059fad64446f837567e32af8d20efa72e72476eb72ca

Replace the token in this example with the one given to you by Jupyter in Session 1. You can also type http://localhost:8888 and there will be a prompt asking you for the token, which you can then copy and paste.
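
The node name, port and token needed for the tunnel all come from the URL printed by Jupyter. As an illustration only, the following sketch splits that URL with shell parameter expansion and prints the matching ssh command and local URL. The variable names are arbitrary, the URL is the example one from Start the Notebook, and someuser@cedar.computecanada.ca is the placeholder login used above.

```shell
# URL printed by Jupyter in Session 1 (example value from this page)
url='http://cdr544.int.cedar.computecanada.ca:8888/?token=7ed7059fad64446f837567e32af8d20efa72e72476eb72ca'
# Placeholder login; replace with your own user name and cluster
login='someuser@cedar.computecanada.ca'

hostport="${url#http://}"      # drop the scheme
hostport="${hostport%%/*}"     # keep only host:port
node="${hostport%%:*}"         # compute node hostname
port="${hostport##*:}"         # port Jupyter listens on
token="${url##*token=}"        # authentication token

# Command to run in the local terminal (Session 2)
echo "ssh -L ${port}:${node}:${port} ${login}"
# Address to open in the browser once the tunnel is up
echo "http://localhost:${port}/?token=${token}"
```

Using the same port number on both sides of the tunnel keeps the local URL identical to the original one apart from the hostname.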

Shut down the Notebook

To shut down the Notebook server before the walltime limit, in the terminal that launched the interactive job, press Ctrl-C two times.

If you used MobaXterm to create a tunnel, press Ctrl-D in Session 2 to shut down the tunnel.

Add kernels

It is possible to add kernels for other programming languages or for Python versions different from the one running the Jupyter Notebook. Refer to kernels for Jupyter to learn more. Installing a new kernel takes two steps. The first step is to install the packages that allow the language interpreter to communicate with the Jupyter Notebook. The second step is to create a file that tells Jupyter Notebook how to initiate a communication channel with the language interpreter. This file is called a kernel spec file.

Each kernel spec file has to be created in its own subfolder inside the following folder in your home directory: ~/.local/share/jupyter/kernels. Jupyter Notebook does not create this folder, so the first step in all cases is to create it. You can use the following command.

[name@server ~]$ mkdir -p  ~/.local/share/jupyter/kernels
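
To make the layout concrete, here is a sketch of what a kernel spec subfolder and its kernel.json file can look like for a Python kernel. The subfolder name my_example_kernel and the display name are hypothetical, and the argv entry assumes that the python found on your PATH has the ipykernel package installed. In practice, tools such as ipykernel or IRkernel generate this file for you.

```shell
# Hypothetical kernel subfolder inside the kernels folder
kernel_dir="$HOME/.local/share/jupyter/kernels/my_example_kernel"
mkdir -p "$kernel_dir"

# kernel.json tells Jupyter how to start the interpreter;
# {connection_file} is filled in by Jupyter when the kernel is launched.
cat > "$kernel_dir/kernel.json" <<'EOF'
{
  "argv": ["python", "-m", "ipykernel_launcher", "-f", "{connection_file}"],
  "display_name": "My Example Kernel",
  "language": "python"
}
EOF
```

Deleting the my_example_kernel subfolder removes the kernel from the Notebook menu again.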

In the following sections, we provide a few examples of kernel installation procedures.

Anaconda

Before installing an Anaconda kernel, make sure you have read the documentation and installed Anaconda.

  1. Load Anaconda module
    [name@server ~]$ module load miniconda3
    
  2. Optional: Activate a specific conda virtual environment
    [name@server ~]$ source activate <your env>
    
  3. Install the ipykernel library
    [name@server ~]$ conda install ipykernel
    
  4. Generate the kernel spec file
    [name@server ~]$ python -m ipykernel install --user --name <unique identifier without white space> --display-name "My Anaconda 3 Kernel"
    
  5. Optional: Deactivate the virtual environment
    [name@server ~]$ source deactivate
    

For more information, refer to the ipykernel documentation.

Julia

  1. Load Julia module
    [name@server ~]$ module load julia
    
  2. Activate Jupyter Notebook virtual environment
    [name@server ~]$ source $HOME/jupyter_py3/bin/activate
    
  3. Install IJulia
    [name@server ~]$ echo 'Pkg.add("IJulia")' | julia
    

For more information, refer to the IJulia documentation.

R

  1. Load R module
    [name@server ~]$ module load r
    
  2. Activate Jupyter Notebook virtual environment
    [name@server ~]$ source $HOME/jupyter_py3/bin/activate
    
  3. Install R kernel dependencies
    [name@server ~]$ R -e "install.packages(c('crayon', 'pbdZMQ', 'devtools'), repos='http://cran.us.r-project.org')"
    
  4. Install R kernel
    [name@server ~]$ R -e "devtools::install_github(paste0('IRkernel/', c('repr', 'IRdisplay', 'IRkernel')))"
    
  5. Install R kernel spec file
    [name@server ~]$ R -e "IRkernel::installspec()"
    

For more information, refer to the IRkernel documentation.

References

  1. http://www.jupyter.org/, The Jupyter Notebook