Singularity

Overview

Singularity[1] is open source software created by Berkeley Lab:

  • as a secure way to use Linux containers on Linux multi-user clusters,
  • as a way to enable users to have full control of their environment, and,
  • as a way to package scientific software and deploy it to different clusters having the same architecture.

In other words, it provides operating-system-level virtualization, commonly called containers.

A container is different from a virtual machine in that a container:

  • likely has less overhead, and,
  • can only run programs capable of running in the same operating system (i.e., Linux when using Singularity) for the same hardware architecture.

(Virtual machines can run different operating systems and sometimes support running software designed for foreign CPU architectures.)

Containers use Linux control groups (cgroups), kernel namespaces, and an overlay filesystem where:

  • cgroups limit, control, and isolate resource usage (e.g., RAM, disk I/O, CPU access);
  • kernel namespaces virtualize and isolate operating system resources of a group of processes, e.g., process and user IDs, filesystems, network access; and,
  • overlay filesystems can be used to enable the appearance of writing to otherwise read-only filesystems.

Singularity is similar to other container solutions such as Docker[2], except that Singularity was specifically designed to let containers be used securely without requiring any special permissions, especially on multi-user compute clusters.[3]

Availability

Singularity is available on Compute Canada clusters (e.g., Cedar and Graham) and on some legacy cluster systems run by various Compute Canada members and consortia across Canada.

Should you wish to use Singularity on your own computer systems, you will need to download and install it per its documentation.[4] You should be using a relatively recent version of a Linux distribution (ideally one whose kernel is v3.10.0 or newer).
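
As a rough illustration of the kernel suggestion above, the following sketch compares the running kernel's version against 3.10.0. The function name version_at_least is a hypothetical helper (it is not part of Singularity), and the script assumes GNU sort with its -V (version sort) option is available:

```shell
#!/bin/bash
# Return success (0) if version $1 is greater than or equal to version $2.
# Assumes GNU sort, which provides the -V (version sort) option.
version_at_least() {
    local have="$1" want="$2"
    # The smaller version sorts first; if that is $want, then $have >= $want.
    [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
}

# Strip any distribution suffix (e.g., "-1160.el7.x86_64") from the kernel release.
kernel="$(uname -r | cut -d- -f1)"

if version_at_least "$kernel" "3.10.0"; then
    echo "Kernel $kernel meets the suggested minimum (3.10.0)."
else
    echo "Kernel $kernel is older than 3.10.0; consider a newer distribution."
fi
```

This only checks the kernel version; it does not verify that Singularity itself is installed or correctly configured.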

Singularity on Compute Canada systems

Loading a module

To use Singularity, first load the specific module you would like to use, e.g.,

$ module load singularity/2.5

Should you need to see all versions of Singularity modules that are available then run:

$ module spider singularity

Creating images

Before using Singularity, you will first need to create a (container) image. A Singularity image is either a file or a directory containing an installation of Linux. One can create a Singularity image using the singularity build command, e.g.,

  • singularity build WHAT-TO-WRITE SPECIAL-URI-OR-PATH

where WHAT-TO-WRITE is:

  • the filename of the Singularity image file (*.sif) to which the built image will be written, or
  • a directory, if one is building a sandbox using the --sandbox option, typically on one's own (Linux) computer (requiring root account access)

and SPECIAL-URI-OR-PATH is:

  • a URI starting with library:// to build from a Container Library,
  • a URI starting with docker:// to build from Docker Hub,
  • a URI starting with shub:// to build from Singularity Hub,
  • a path to an existing container,
  • a path to a directory (to build from a sandbox), or
  • a path to a Singularity definition file (which is a recipe on how to build the image).
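
To make the different kinds of SPECIAL-URI-OR-PATH concrete, here is a small sketch that classifies a build source the same way the list above does. The function describe_build_source is a hypothetical helper written for illustration only; it is not part of Singularity and does not run singularity build itself:

```shell
#!/bin/bash
# Describe what kind of source the second argument of "singularity build" is.
# (describe_build_source is a hypothetical helper, not part of Singularity.)
describe_build_source() {
    local src="$1"
    case "$src" in
        library://*) echo "Container Library image" ;;
        docker://*)  echo "Docker Hub image" ;;
        shub://*)    echo "Singularity Hub image" ;;
        *)
            # Not a recognized URI, so treat it as a path on this machine.
            if [ -d "$src" ]; then
                echo "sandbox directory"
            elif [ -f "$src" ]; then
                echo "existing container or recipe file"
            else
                echo "unknown source"
                return 1
            fi
            ;;
    esac
}

describe_build_source "docker://ubuntu"    # prints: Docker Hub image
```

A check like this can be handy in a wrapper script that wants to fail early on a mistyped URI or path before starting a potentially long build.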

Advice on Creating Images

You are strongly advised to create Singularity images using a computer or virtual machine that runs Linux, has Singularity installed, has Internet access, and, ideally, on which you also have root (e.g., sudo) permissions.

If you do not have root permissions to create your image(s) then realize the following:

  • all permissions inside the image will be set to be that of the account (user and group) the image is made under;
  • depending on the contents of the image, you may or may not be able to upgrade it later due to this; and,
  • it is not, in general, possible to easily reverse this.

For example, Debian and Ubuntu images' dpkg, apt-get, and apt commands all require root to upgrade/install packages. Thus, if the ability to install and upgrade software in the future is important, then create the image on a Linux system where you have root permissions.

NOTE: Any image you create on your own computer needs to be uploaded to the cluster before you can use that image.

Creating Images on Compute Canada Clusters

If you decide to create an image on a Compute Canada cluster, be aware that you will never have sudo access, so the caveats of the previous section apply. Images can be created on any of the Compute Canada clusters or visualization computers, e.g., gra-vdi.computecanada.ca. Our image creation advice differs depending on which machine you use:

  • beluga.computecanada.ca: Connect using SSH. Use a login node to create the image.
  • cedar.computecanada.ca: Connect using SSH. Create the image in an interactive job. Do not use a login node.
  • graham.computecanada.ca: Connect using SSH. Use a login node to create the image.
  • gra-vdi.computecanada.ca: Connect using VNC. Use a terminal window to create the image.
  • niagara.computecanada.ca: Connect using SSH. Use a login node to create the image.
    • IMPORTANT: Do not bind to /localscratch on niagara as it does not exist!

Of these systems, gra-vdi.computecanada.ca is the best one on which to create images. The other systems have issues such as:

  • beluga, graham, and niagara:
    • the maximum amount of RAM that can be used on login nodes is fixed, and,
    • there is no Internet access on compute nodes.
  • cedar:
    • login nodes cannot be successfully used, and,
    • compute node jobs require specifying the amount of RAM needed, which is very difficult to know in advance. Thus, if an error occurs, exit the interactive job and try again, requesting more memory.

Creating an image using Docker Hub

Docker Hub provides an interface to search for images.

If the Docker Hub URL for the container you want is docker://ubuntu, then you would download the container and build an image from it by running:

$ singularity build myubuntuimage.sif docker://ubuntu

Is sudo needed or not needed?

The Singularity documentation on building a container uses the sudo command. This is because, in general, many uses of the build command require root, i.e., superuser, access on the system on which it is run. On Compute Canada systems, regular users do not have root account access, so the sudo command cannot be used. If you are building a pre-built image from Singularity Hub or Docker Hub, you typically will not need sudo access. If you do need root access to build an image, then you will need either to ask Compute Canada staff for help or to install Linux and Singularity on your own computer to have root account access.

Many users don't need to use the sudo command to build their images from Singularity Hub or Docker Hub. If sudo is not used, then the following will happen when you build the image:

  • Singularity will output a warning that this may result in an image that does not work. This message is only a warning, though; the image will still be created.
  • All filesystem permissions will be collapsed to be the permissions of the Linux user and group that is running singularity build. (This is normally the user and group you are logged in as.)

Typically one will not need to be concerned with retaining all filesystem permissions unless:

  • one needs to regularly update/reconfigure the contents of the image, and,
  • tools used to update/reconfigure the contents of the image require those permissions to be retained.

For example, many Linux distributions make it easy to update or install new software using commands such as:

  • apt-get update && apt-get upgrade
  • apt-get install some-software-package
  • yum install some-software-package
  • dnf install some-software-package
  • etc.

It is possible that these and other commands may not run successfully unless filesystem permissions are retained.

Sometimes image creation will fail due to various user restrictions placed on the node you are using. The login nodes, in particular, have a number of restrictions which may prevent one from successfully building an image. If this is an issue, then request assistance to create the Singularity image by contacting Technical support.

Using Singularity

NOTE: The discussion below does not describe how to use Slurm to run interactive or batch jobs; it only describes how to use Singularity. For interactive and batch job information, see the Running jobs page.

Although you may have used sudo when creating your Singularity image, you do not need, and cannot use, sudo to run programs in your image on Compute Canada systems. There are a number of ways to run programs in your image:

  1. Run commands interactively in one Singularity session.
  2. Run a single command which executes and then stops running.
  3. Run a container instance in order to run daemons which may have backgrounded processes.

Running commands interactively

Singularity can be used interactively by using its shell command, e.g.,

$ singularity shell --help

will give help on shell command usage. The following:

$ singularity shell -B /home -B /project -B /scratch -B /localscratch myimage.simg

will do the following within the container image myimage.simg:

  • bind mount /home so that all home directories can be accessed (subject to your account's permissions)
  • bind mount /project so that project directories can be accessed (subject to your account's permissions)
  • bind mount /scratch so that the scratch directory can be accessed (subject to your account's permissions)
  • bind mount /localscratch so that the localscratch directory can be accessed (subject to your account's permissions)
  • run a shell (e.g., /bin/bash)

If this command is successful, you can interactively run commands from within your container while still being able to access your files in home, project, scratch, and localscratch. :-)

  • NOTE: When done, type "exit" to exit the shell.

In some cases, you will not want environment variables from your Compute Canada shell to pollute the container's environment. You can run a "clean environment" shell by adding the -e option, e.g.,

$ singularity shell -e -B /home -B /project -B /scratch -B /localscratch myimage.simg

but be aware that you may then need to define some shell environment variables yourself, such as $USER.

Finally, if you are using Singularity interactively on your own machine, in order for your changes to the image to be written to the disk, you must:

  • be using a Singularity "sandbox" image (i.e., be using a directory not the read-only .simg file)
  • be using the -w option, and,
  • be using sudo

e.g., first create your sandbox image:

$ sudo singularity build -s myimage-dir myimage.simg

and then engage with Singularity interactively:

$ sudo singularity shell -w myimage-dir

When done, you can build a new/updated simg file with the command:

$ sudo singularity build myimage-new.simg myimage-dir/

and upload myimage-new.simg to a cluster in order to use it.

Running a single command

When submitting jobs that invoke commands in Singularity containers, one will use either Singularity's exec command or its run command.

  • The exec command does not require any configuration.
  • The run command requires configuring an application within a Singularity recipe file and this is not discussed on this page.

The Singularity exec command's options are almost identical to the shell command's options, e.g.,

$ singularity exec --help

When not asking for help, the exec command runs the command you specify within the container and then leaves the container, e.g.,

$ singularity exec -B /home -B /project -B /scratch -B /localscratch myimage.simg ls /

which will output the contents of the root directory within the container. The version of ls is the one installed within the container! For example, should GCC's gcc be installed in the myimage.simg container, then this command:

$ singularity exec -B /home -B /project -B /scratch -B /localscratch myimage.simg gcc -v

will output the version information of what is installed within the container whereas running at the normal Compute Canada shell prompt:

$ gcc -v

will output the version of GCC currently loaded on Compute Canada systems.

If you need to run a single command from within your Singularity container in a job, then the exec command will suffice. Remember to bind mount the directories you will need access to in order for your job to run successfully.
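
As a sketch of how exec might be used in a batch job, the following writes a minimal Slurm job script to a file named singularity_job.sh. The account name def-someuser, the time and memory requests, and the file name are placeholders chosen for illustration; the image name and bind mounts are taken from the examples above. Adjust all of these to your own situation:

```shell
#!/bin/bash
# Write a minimal Slurm job script that runs one command in a container.
# The account, time, and memory values below are placeholders.
cat > singularity_job.sh <<'EOF'
#!/bin/bash
#SBATCH --account=def-someuser
#SBATCH --time=00:15:00
#SBATCH --mem=4G

module load singularity/2.5

# Bind mount the filesystems the job needs, then run a single command.
singularity exec -B /home -B /project -B /scratch -B /localscratch \
    myimage.simg ls /
EOF

echo "Wrote singularity_job.sh; submit it with: sbatch singularity_job.sh"
```

The heredoc is quoted ('EOF') so that nothing inside the job script is expanded when the file is written. For details on job submission itself, see the Running jobs page.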

Running container instances

Should you need to run daemons and backgrounded processes within your container, then do not use the Singularity exec command! Instead you want to use Singularity's instance.start and instance.stop commands to create and destroy sessions (i.e., container instances). By using sessions, Singularity will ensure that all programs running within the instance are terminated when your job ends, unexpectedly dies, is killed, etc.

To start a Singularity session instance, decide on a name for this session, e.g., quadrat5run, and run the instance.start command specifying the image name, e.g., myimage.simg, and your session name:

$ singularity instance.start myimage.simg quadrat5run

A session (and all associated programs that are running) can be stopped (i.e., destroyed/killed) by running the instance.stop command, e.g.,

$ singularity instance.stop quadrat5run

At any time you can obtain a list of all sessions you currently have running by running:

$ singularity instance.list

which will list the daemon name, its PID, and the path to the container's image.

With a session started, programs can be run using Singularity's shell, exec, or run commands by specifying the name of the session, prefixed with instance://, in place of the image name, e.g.,

$ singularity instance.start myimage.simg mysessionname
$ singularity exec instance://mysessionname ps -eaf
$ singularity shell instance://mysessionname
nohup find / -type d >dump.txt
exit
$ singularity exec instance://mysessionname ps -eaf
$ singularity instance.stop mysessionname

Bind mounts

When running a program within a Singularity container, by default, it can only see the files within the container image and the current directory. Realistically your Singularity jobs will need to mount the various filesystems where your files are. This is done using the -B option to the Singularity shell, exec, or run commands, e.g.,

$ singularity shell -B /home -B /project -B /scratch -B /localscratch myimage.simg
$ singularity exec -B /home -B /project -B /scratch -B /localscratch myimage.simg ls /
$ singularity run -B /home -B /project -B /scratch -B /localscratch myimage.simg some-program

The previous three commands show how to bind mount the various filesystems on Compute Canada's systems, i.e., within the container image myimage.simg these commands bind mount:

  • /home so that all home directories can be accessed (subject to your account's permissions)
  • /project so that project directories can be accessed (subject to your account's permissions)
  • /scratch so that the scratch directory can be accessed (subject to your account's permissions)
  • /localscratch so that the localscratch directory can be accessed (subject to your account's permissions)

In most cases, it is not recommended to directly mount each directory you need as this can cause access issues. Instead, mount the top directory of the filesystem as shown above.
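
Not every cluster has all four of these filesystems (for example, /localscratch does not exist on niagara). One possible way to cope with this is sketched below: it collects -B options only for the top-level directories that exist on the current host. The function build_bind_args and the array BIND_ARGS are hypothetical names used for illustration; they are not part of Singularity:

```shell
#!/bin/bash
# Collect "-B <dir>" options for every listed directory that exists on this host.
# (build_bind_args and BIND_ARGS are hypothetical names, not part of Singularity.)
build_bind_args() {
    BIND_ARGS=()
    local dir
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            BIND_ARGS+=(-B "$dir")
        fi
    done
}

build_bind_args /home /project /scratch /localscratch
# Show the command that would be run; remove "echo" to actually run it.
echo singularity shell "${BIND_ARGS[@]}" myimage.simg
```

Using a bash array keeps each option and path as a separate word, so the resulting command line stays correct even if a path contains spaces.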


HPC issues

Running MPI programs from within a container

If you are running MPI programs, nothing special needs to be done for jobs running on a single node.

Running MPI jobs across nodes requires:

  • Ensuring your MPI program is compiled using the OpenMPI installed inside your Singularity container.
    • Ideally the version of OpenMPI inside the container is version 3 or 4. Version 2 may or may not work. Version 1 will not work.
  • Ensuring your Slurm job script uses srun, not mpirun or mpiexec, to run the MPI program, e.g.,
srun singularity exec /path/to/your/singularity/image.sif /path/to/your-program
  • Ensuring there are no module load commands in your job script.
  • Loading the following modules in the Compute Canada shell environment before submitting the job using sbatch:
    • singularity
    • openmpi (This does not need to match the OpenMPI version installed inside the container. Ideally use version 4 or version 3; version 2 may or may not work; version 1 will not work.)

Finally, ensure that a high performance interconnect package is also installed in your image, i.e.,

  • clusters using OmniPath need libpsm2, and,
  • clusters using Infiniband need UCX.
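
Putting the MPI requirements above together, a multi-node job might look like the following sketch, which writes a job script to a file named mpi_singularity_job.sh. The account name def-someuser, the node, task, memory, and time requests, and the file name are placeholders chosen for illustration; the image and program paths are the same placeholders used above:

```shell
#!/bin/bash
# Write a sketch of a multi-node MPI job script that uses a container.
# The account and resource requests below are placeholders.
cat > mpi_singularity_job.sh <<'EOF'
#!/bin/bash
#SBATCH --account=def-someuser
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --mem-per-cpu=2G
#SBATCH --time=00:30:00

# No "module load" commands here: the modules are loaded in the
# submitting shell before sbatch is run.
srun singularity exec /path/to/your/singularity/image.sif /path/to/your-program
EOF

# In the login shell, load the modules and then submit:
#   module load singularity openmpi
#   sbatch mpi_singularity_job.sh
echo "Wrote mpi_singularity_job.sh"
```

Note that the job script itself contains no module load lines and launches the program with srun, as the list above requires.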


Changing Singularity Default Directories

You can override Singularity's default temporary and cache directories by setting these environment variables before running singularity:

  • SINGULARITY_CACHEDIR: the directory where singularity will download (and cache) files
  • SINGULARITY_TMPDIR: the directory where singularity will write temporary files including when building (squashfs) images

For example, to tell singularity to use your scratch space for its cache and temporary files, one might run:

$ mkdir -p /scratch/$USER/singularity/{cache,tmp}
$ export SINGULARITY_CACHEDIR="/scratch/$USER/singularity/cache"
$ export SINGULARITY_TMPDIR="/scratch/$USER/singularity/tmp"

before running singularity.

References

  1. Singularity Software website: https://www.sylabs.io/docs/
  2. Docker Software website: https://www.docker.com/
  3. Singularity Security Documentation: https://www.sylabs.io/guides/2.5.1/admin-guide/security.html
  4. Singularity Documentation: https://www.sylabs.io/docs/