JupyterHub
Revision as of 22:09, 16 December 2021
"JupyterHub is the best way to serve Jupyter notebook for multiple users. It can be used in a class of students, a corporate data science group or scientific research group." [1]
JupyterHub provides a preconfigured version of JupyterLab and/or Jupyter Notebook; for more configuration options, please check the Jupyter page.
Compute Canada initiatives
Some regional initiatives offer access to computing resources through JupyterHub.
JupyterHub on clusters
On the following clusters‡, use your Compute Canada username and password to connect to JupyterHub:
JupyterHub | Comments |
---|---|
Béluga | Provides access to JupyterLab servers spawned through jobs on the cluster Béluga |
Cedar | Provides access to JupyterLab servers spawned through jobs on the cluster Cedar. The authentication is done on idpmfa.mit.c3.ca |
Hélios | Provides access to Jupyter Notebook servers spawned through jobs on the GPU cluster Hélios |
Narval | Provides access to JupyterLab servers spawned through jobs on the cluster Narval |
Niagara | A node designated as a JupyterHub that can run Jupyter Notebook sessions. To learn more, see the SciNet JupyterHub wiki page |
‡ Note that the compute nodes running the Jupyter kernels do not have internet access. This means that you can only transfer files to and from your own computer; you cannot download code or data from the internet (e.g. "git clone" will not work, and "pip install" will fail if the wheel is absent from our wheelhouse). You may also have problems if your code itself performs downloads or uploads, which is common in machine learning code that fetches its datasets at run time.
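Because a download attempted from a compute node will not succeed, it can help to make code fail early with a clear message when an expected file has not been transferred beforehand. The following is a minimal sketch of that idea; the helper name `require_local_file` is hypothetical and not part of any JupyterHub or cluster tool.

```python
from pathlib import Path


def require_local_file(path):
    """Return `path` as a Path if the file exists; otherwise fail clearly.

    Hypothetical helper: compute nodes have no internet access, so the
    file must have been transferred from your own computer beforehand.
    """
    candidate = Path(path).expanduser()
    if not candidate.is_file():
        raise FileNotFoundError(
            f"{candidate} not found: transfer it before starting the session, "
            "because the compute node cannot download it."
        )
    return candidate
```

Calling this helper at the top of a notebook, before any long computation, surfaces a missing dataset immediately instead of after a failed download attempt.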
JupyterHub for universities and schools
- The Pacific Institute for the Mathematical Sciences, in collaboration with Compute Canada and Cybera, offers cloud-based hubs to universities and schools. Each institution can have its own hub where users authenticate with their credentials from that institution. The hubs are hosted on the Compute Canada Cloud and are intended primarily for training purposes. Institutions interested in obtaining their own hub can visit http://syzygy.ca. See also the announcement "Compute Canada and PIMS launch Jupyter service for researchers".
Server Options
Once logged in, depending on the configuration of JupyterHub, the user's Web browser is redirected to either a) a previously launched Jupyter server, b) a new Jupyter server with default options, or c) a form that allows a user to set different options for their Jupyter server before pressing the Start button. In all cases, it is similar to accessing requested resources via an interactive job.
Compute resources
For example, Server Options available on Béluga's JupyterHub are:
- Account to be used: any `def-*`, `rrg-*`, `rpp-*` or `ctb-*` account a user has access to
- Time (hours) required for the session
- Number of (CPU) cores that will be reserved on a single node
- Memory (MB) limit for the entire session
- (Optional) GPU configuration: at least one GPU
- User interface (see below)
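Since starting a server is similar to requesting an interactive job, the form fields above can be read as the parameters of such a job. The sketch below maps them to an `salloc` command line, assuming the cluster's scheduler is Slurm; how JupyterHub actually submits the job is not described here, and the function name and the account `def-someuser` are placeholders.

```python
def build_salloc_command(account, hours, cores, memory_mb, gpus=0):
    """Translate Server Options form values into an salloc argument list.

    Assumption: the scheduler is Slurm. This only illustrates what the
    form fields mean; the hub may submit its job differently.
    """
    total_minutes = round(hours * 60)
    time_limit = f"{total_minutes // 60:02d}:{total_minutes % 60:02d}:00"
    command = [
        "salloc",
        f"--account={account}",
        f"--time={time_limit}",
        "--nodes=1",  # cores are reserved on a single node
        f"--cpus-per-task={cores}",
        f"--mem={memory_mb}M",  # memory limit for the session, in megabytes
    ]
    if gpus > 0:
        command.append(f"--gres=gpu:{gpus}")  # optional GPU request
    return command


# Placeholder values: 1.5 hours, 4 cores, 8000 MB and one GPU.
print(" ".join(build_salloc_command("def-someuser", 1.5, 4, 8000, gpus=1)))
```

Reading the form this way also explains the wait time: a request for more hours, cores, memory or GPUs is a larger job for the scheduler to place.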
User Interface
While JupyterHub allows each user to use one Jupyter server at a time on each hub, there can be multiple options under User interface:
- Jupyter Notebook (classic interface) - Even though it offers many functionalities, the community is moving towards JupyterLab, which is a better platform that offers many more features
- JupyterLab (modern interface) - This is the most recommended Jupyter user interface for interactive prototyping and data visualization
- Terminal (for a single terminal only) - It gives access to a terminal connected to a remote account, which is comparable to connecting to a server through an SSH connection
Note: JupyterHub may also have been configured to force a specific user interface. This is usually done for special events.
JupyterLab
JupyterLab is now the recommended general-purpose user interface to use on a JupyterHub. From a JupyterLab server, you can manage your remote files and folders, and you can launch Jupyter applications like a terminal, (Python 3) notebooks, RStudio and a Linux desktop.
The JupyterLab Interface
When JupyterLab is ready to be used, the interface has multiple panels.
Menu bar on top
- In the File menu:
  - Hub Control Panel: if you want to manually stop the JupyterLab server and the corresponding job on the cluster. This is useful when you want to start a new JupyterLab server with more or less resources
  - Log Out: the JupyterHub session will end, which will also stop the JupyterLab server and the corresponding job on the cluster
- Most other menu items are related to notebooks and Jupyter applications
Tool selector on left
- File Browser (folder icon):
  - This is where you can browse in your home, project and scratch spaces
  - It is also possible to upload files
- Running Terminals and Kernels (stop icon):
  - To stop kernel sessions and terminal sessions
- Commands
- Property Inspector
- Open Tabs:
  - To navigate between application tabs
  - To close application tabs - the corresponding kernels remain active
- Software (blue diamond sign):
  - Compute Canada modules can be loaded and unloaded in the JupyterLab session. Depending on the modules loaded, icons directing to the corresponding Jupyter applications will appear in the Launcher tab.
  - The search box can search for any available module and give the result in the Available Modules sub-panel. Note: some modules are hidden until their dependency is loaded - we recommend that you first look for a specific module with `module spider module_name` from a terminal.
  - The next sub-panel is the list of Loaded Modules in the whole JupyterLab session. Note: while `python` and `ipython-kernel` modules are loaded by default, additional modules must be loaded before launching some other applications or notebooks. For example: `scipy-stack`.
  - The last sub-panel is the list of Available modules, similar to the output of `module avail`. By clicking on a module's name, detailed information about the module is displayed. By clicking on the Load link, the module will be loaded and added to the Loaded Modules list.
Application area on right
- The Launcher tab is opened by default
- It contains all available Jupyter applications and notebooks, depending on which modules are loaded
Status bar at the bottom
- Clicking on the icons brings you to the Running Terminals and Kernels tool.
Prebuilt Applications
JupyterLab offers access to a terminal, an IDE (Desktop), a Python console and different options to create text and Markdown files. This section presents only the main supported Jupyter applications that work with the Compute Canada software stack.
Command Line Interpreters
Julia Console
To enable the Julia 1.x console launcher, an `ijulia-kernel` module needs to be loaded. When launched, a Julia interpreter is presented in a new JupyterLab tab.
Python Console
The Python 3.x console launcher is available by default in a new JupyterLab session. When launched, a Python 3 interpreter is presented in a new JupyterLab tab.
Terminal
This application launcher will open a terminal in a new JupyterLab tab:
- The terminal runs a (Bash) shell on the remote compute node without the need of an SSH connection
  - Gives access to the remote filesystems (`/home`, `/project`, `/scratch`)
  - Allows running compute tasks
- The terminal allows copy-and-paste operations of text:
  - Copy operation: select the text, then press Ctrl+C
    - Note: usually, Ctrl+C is used to send a SIGINT signal to a running process, or to cancel the current command. To get this behaviour in JupyterLab's terminal, click on the terminal to deselect any text before pressing Ctrl+C
  - Paste operation: press Ctrl+V
Available Notebook Kernels
Julia Notebook
To enable the Julia 1.x notebook launcher, an `ijulia-kernel` module needs to be loaded. When launched, a Julia notebook is presented in a new JupyterLab tab.
Python Notebook
If any of the following scientific Python packages is required by your notebook, before you open this notebook, you must load the `scipy-stack` module from the JupyterLab Softwares tool:
- `ipython`, `ipython_genutils`, `ipykernel`, `ipyparallel`
- `matplotlib`
- `numpy`
- `pandas`
- `scipy`
- Other notable packages: `Cycler`, `futures`, `jupyter_client`, `jupyter_core`, `mpmath`, `pathlib2`, `pexpect`, `pickleshare`, `ptyprocess`, `pyzmq`, `simplegeneric`, `sympy`, `tornado`, `traitlets`
- And many more (click on the `scipy-stack` module to see all Included extensions)
Note: you may also install needed packages by running for example the following command inside of a cell: `!pip install --no-index numpy`
- For some packages (like `plotly`, for example), you may need to restart the notebook's kernel before importing the package.
- The installation of packages in the default Python kernel environment is temporary to the lifetime of the JupyterLab session; you will have to reinstall these packages the next time you start a new JupyterLab session. For a persistent Python environment, you must configure a custom Python kernel.
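Because packages installed this way disappear with the session, it can be handy to check at the top of a notebook which ones are missing. The sketch below uses the standard library's `importlib.util.find_spec` for that check; the helper names are hypothetical, and it assumes that each package's import name matches its install name, which is not true for every package.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be imported right now."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def install_commands(names):
    """Build one notebook-cell install command per missing package."""
    return [f"!pip install --no-index {name}" for name in missing_packages(names)]


# Example: print the commands to run in a notebook cell, if any are needed.
for command in install_commands(["numpy", "pandas"]):
    print(command)
```

If nothing is printed, the listed packages are already importable in the current kernel environment, for instance because `scipy-stack` is loaded.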
To open an existing Python notebook:
- Go back to the File Browser
- Browse to the location of the `*.ipynb` file
- Double-click on the `*.ipynb` file:
  - This will open the Python notebook in a new JupyterLab tab
  - An IPython kernel will start running in the background for this notebook
To open a new Python notebook in the current File Browser directory:
- Click on the Python 3.x launcher under the Notebook section:
  - This will open a new Python 3 notebook in a new JupyterLab tab
  - A new IPython kernel will start running in the background for this notebook
Other Applications
OpenRefine
To enable the OpenRefine application launcher, an `openrefine` module needs to be loaded. Depending on the software environment version, the latest version of OpenRefine should be loaded:
- With `StdEnv/2020`, load module: `openrefine/3.4.1`
- With `StdEnv/2018.3`, load module: `openrefine/3.3`
This OpenRefine launcher will open or reopen an OpenRefine interface in a new Web browser tab:
- It is possible to reopen an active OpenRefine session after the Web browser tab was closed
- The OpenRefine session ends when the JupyterLab session ends
RStudio
To enable the RStudio application launcher, the following three modules need to be loaded:
- `gcc`
- `r`
- `rstudio-server`
Depending on the software environment version, you should load the following two modules (`r` is loaded automatically):
- With `StdEnv/2020`, it is not yet supported
- With `StdEnv/2018.3`, load modules: `gcc/7.3.0`, `rstudio-server/1.2.1335`
- With `StdEnv/2016.4`, load modules: `gcc/7.3.0`, `rstudio-server/1.2.1335`
This RStudio launcher will open or reopen an RStudio interface in a new Web browser tab:
- It is possible to reopen an active RStudio session after the Web browser tab was closed
- The RStudio session ends when the JupyterLab session ends
VS Code
To enable the VS Code (Visual Studio Code) application launcher, a `code-server` module needs to be loaded. Depending on the software environment version, the latest version of VS Code should be loaded:
- With `StdEnv/2020`, load module: `code-server/3.5.0`
- With `StdEnv/2018.3`, load module: `code-server/3.4.1`
This VS Code launcher will open or reopen the VS Code interface in a new Web browser tab:
- For a new session, the VS Code session can take up to 3 minutes to complete its startup.
- It is possible to reopen an active VS Code session after the Web browser tab was closed
- The VS Code session ends when the JupyterLab session ends
Desktop
This Desktop launcher will open or reopen a remote Linux desktop interface in a new Web browser tab:
- This is equivalent to running a VNC server on a compute node, then creating an SSH tunnel and finally using a VNC client, but none of these steps is needed with JupyterLab
- It is possible to reopen an active desktop session after the Web browser tab was closed
- The desktop session ends when the JupyterLab session ends
Possible error messages
Most JupyterHub errors are caused by the underlying job scheduler, which is either unresponsive or unable to find appropriate resources for your session. For example:
- A "time-out" error message when starting a JupyterLab session:
  - Just like any interactive job on any cluster, a longer requested time can cause a longer wait time in the queue
  - There may be no interactive node available at the moment; in that case, try again later