Prolonging terminal sessions

From Alliance Doc
Revision as of 13:20, 27 March 2023 by Stubbsda (talk | contribs)
Jump to navigation Jump to search
Other languages:

For working on the clusters, most users will need to use SSH to connect to the cluster for job submission and monitoring, editing files and so forth. Keeping this SSH connection alive for a long period of time, ranging from hours to days, may be necessary for some users and this page proposes a variety of techniques for keeping such a terminal session alive.

SSH configuration

One simple solution is to modify the configuration of your SSH client to prolong the connection. On MacOS and Linux the client configuration is found in $HOME/.ssh/config while in Windows it is located in C:\Users\<username>\.ssh\config. Note that the file may not exist initially, so you will need to create it; you should add the lines

Host *
    ServerAliveInterval 240

This addition will ensure the transmission of a sign-of-life signal over the SSH connection to any remote server (such as an Alliance cluster) every 240 seconds, i.e. four minutes, which should help to keep your SSH connection alive even if it is idle for several hours.

Terminal multiplexers

The programs tmux and screen are examples of a terminal multiplexer - a program which allows you to detach your terminal session entirely, where it will keep on running on its own until you choose to reattach to it. With such a program, you can logout from the cluster, turn off the workstation or hibernate the laptop you use to connect to the cluster and when you're ready to start working again the next day, reattach to your session and start from right where you left off.

Login Node Dependency
Each of our clusters has several login nodes and your tmux or screen session is specific to a login node. If you wish to reattach to a session, you must ensure you're connected to the right login node which of course means remembering which login node you were using when you started tmux or screen. Login nodes may also occasionally be rebooted, which will kill any detached terminal sessions on that node.

tmux

The tmux software is a terminal multiplexer, allowing multiple virtual sessions in a single terminal session. You can thus disconnect from an SSH session without interrupting its process(es).

Here are some introductions to tmux:

Cheat sheet

For a complete reference, see this page.

Command Description
tmux Start a server
Ctrl+B D Disconnect from server
tmux a Reconnect to server
Ctrl+B C Create a new window
Ctrl+B N Go to next window
Ctrl+B [ Enable "copy" mode, allowing to scroll with the mouse and Page-Up Page-Down
Esc Disable "copy" mode

Launch tmux inside a job submitted through tmux

If you submit a job with tmux and try to start tmux within the same job, you will get the lost server error message. This happens because the $TMUX environment variable pointing to the tmux server is propagated to the job. The value of the variable is not valid and you can reset it with:

Question.png
[name@server ~]$ unset TMUX

However, nested use of tmux is not recommended. To send commands to a nested tmux, one has to hit Ctrl+B twice; for example, to create a new window, one has to use Ctrl+B Ctrl+B C. Consider using screen inside your job (if you are using tmux on a login node).

GNU Screen

The screen program is anpther widely used terminal multiplexer. To create a detached terminal session, you can use the following command

Question.png
[name@server ~]$ screen -S <session name>

It's a good idea to give a descriptive name to your terminal sessions, making it easier to identify them later. You can use the command screen -list to see a list of your detached terminal sessions on this node,

Question.png
[name@server ~]$ screen -list
There is a screen on:
        164133.foo      (Attached)
1 Socket in /tmp/S-stubbsda.

You can attach to one of your sessions using the command screen -d -r <session name>.