38,760
edits
(Updating to match new version of source page) |
(Updating to match new version of source page) |
||
Line 34: | Line 34: | ||
* [https://www.docker.com/ Docker] | * [https://www.docker.com/ Docker] | ||
** Using Docker on an multi-user cluster creates security risks, therefore we do not make Docker available on our HPC clusters. | ** Using Docker on an multi-user cluster creates security risks, therefore we do not make Docker available on our HPC clusters. | ||
** You can install and use Docker on your own computer and use it to create an Apptainer image, which can then be uploaded to an HPC cluster as outlined in | ** You can install and use Docker on your own computer and use it to create an Apptainer image, which can then be uploaded to an HPC cluster as outlined in <b>[[#Creating_an_Apptainer_Container_From_a_Dockerfile|this section]]</b> later on this page. | ||
==Other items== | ==Other items== | ||
===General=== | ===General=== | ||
* In order to use Apptainer you must have a container | * In order to use Apptainer you must have a container <b>image</b>, e.g., a <code>.sif</code> file or a "sandbox" directory created previously. If you don't already have an image or a sandbox, see the section on <b>[[#Building_an_Apptainer_Container/Image|building an image]]</b> below. | ||
* While Apptainer is installed and available for use, using Apptainer will require you to install and/or build all software you will need to make use of in your container. In many instances, | * While Apptainer is installed and available for use, using Apptainer will require you to install and/or build all software you will need to make use of in your container. In many instances, <b>[[Available_software|we already have such software installed on our clusters]]</b> so there is often no need to create a container with the same installed in it. | ||
===<code>sudo</code>=== | ===<code>sudo</code>=== | ||
Many users ask about <code>sudo</code> since documentation and | Many users ask about <code>sudo</code> since documentation and websites often discuss using <code>sudo</code>. Know the ability to use <code>sudo</code> to obtain superuser/root permissions is not available on our clusters. Should you require using <code>sudo</code>, consider the following options: | ||
* Install Linux, Apptainer, and <code>sudo</code> in a virtual machine on a system you control so you will be able to have <code>sudo</code> access within such. Build your image(s) on that machine and upload them in order to use them on Alliance systems. | * Install Linux, Apptainer, and <code>sudo</code> in a virtual machine on a system you control so you will be able to have <code>sudo</code> access within such. Build your image(s) on that machine and upload them in order to use them on Alliance systems. | ||
Line 66: | Line 66: | ||
==Important command line options== | ==Important command line options== | ||
Software that is run inside a container runs in a different environment using different libraries and tools than what is installed on the host system. It is, therefore, important to run programs within containers by | Software that is run inside a container runs in a different environment using different libraries and tools than what is installed on the host system. It is, therefore, important to run programs within containers by <b>not</b> using any environment settings or software defined outside of the container. Unfortunately, by default, Apptainer will run adopting the shell environment of the host and this can result in issues when running programs. To avoid such issues, when using <code>apptainer run</code>, <code>apptainer shell</code>, <code>apptainer exec</code>, and/or </code>apptainer instance</code>, use one of these options (with more preference to those options listed earlier in the table below): | ||
{| class="wikitable" | {| class="wikitable" | ||
Line 88: | Line 88: | ||
* The workdir can be removed if there are no live containers using it. | * The workdir can be removed if there are no live containers using it. | ||
* When using Apptainer in an <code>salloc</code>, in an <code>sbatch</code> job, or when using [JupyterHub] on our clusters, use <code>${SLURM_TMPDIR}</code> for the "workdir" location, e.g., <code>-W ${SLURM_TMPDIR}</code>. | * When using Apptainer in an <code>salloc</code>, in an <code>sbatch</code> job, or when using [JupyterHub] on our clusters, use <code>${SLURM_TMPDIR}</code> for the "workdir" location, e.g., <code>-W ${SLURM_TMPDIR}</code>. | ||
** ASIDE: One should | ** ASIDE: One should <b>not</b> be running programs (including Apptainer) on a login node. Use an interactive <code>salloc</code> job. | ||
* When using bind mounts, see the [[#Bind_Mounts|section on bind mounts]] below since not all Alliance clusters are the same concerning the exact bind mounts needed to access <code>/home</code>, <code>/project</code>, and <code>/scratch</code>. | * When using bind mounts, see the [[#Bind_Mounts|section on bind mounts]] below since not all Alliance clusters are the same concerning the exact bind mounts needed to access <code>/home</code>, <code>/project</code>, and <code>/scratch</code>. | ||
Line 183: | Line 183: | ||
as well as the [http://apptainer.org/docs/user/main/index.html official Apptainer documentation]. | as well as the [http://apptainer.org/docs/user/main/index.html official Apptainer documentation]. | ||
<b>IMPORTANT:</b>In addition to choose to use the above options, if you are making use of a persistent overlay image (as a separate file or contained within the SIF file) and want changes to be written to that image, it is extremely important to pass the <code>-w</code> or <code>--writable</code> option to your container. If this option is not passed to it, any changes you make to the image in the <code>apptainer shell</code> session will not be saved! | |||
==Running daemons: <code>apptainer instance</code>== | ==Running daemons: <code>apptainer instance</code>== | ||
Line 190: | Line 190: | ||
Apptainer has been designed to be able to properly run daemons within compute jobs on clusters. Running daemons is achieved, in part, by using <code>apptainer instance</code>. See the [http://apptainer.org/docs/user/main/running_services.html official Apptainer documentation on Running Services] for the details. | Apptainer has been designed to be able to properly run daemons within compute jobs on clusters. Running daemons is achieved, in part, by using <code>apptainer instance</code>. See the [http://apptainer.org/docs/user/main/running_services.html official Apptainer documentation on Running Services] for the details. | ||
<b>NOTE 1:</b> Don't run daemons manually without using <code>apptainer instance</code> and related commands. Apptainer works properly with other tools such as the Slurm scheduler that run on our clusters. When a job is cancelled, killed, crashes, or is otherwise finished, daemons run using <code>apptainer instance</code> will not hang or result in defunct processes. Additionally by using the <code>apptainer instance</code> command you will be able to control the daemons and programs running in the same container. | |||
<b>NOTE 2:</b> Running daemons in your job will only run those daemons while your job runs. The scheduler will kill all processes attached to a job when the granted job time expires. If you need to run daemons continuously for longer than a job, submit a ticket asking to discuss such with staff persons. Such may require creating a cloud virtual machine to achieve such. | |||
==Running MPI programs== | ==Running MPI programs== | ||
Line 198: | Line 198: | ||
Running MPI programs within an Apptainer container across nodes likely will require special configuration. MPI exploits cluster interconnection hardware to communicate amongst nodes much more efficiently. Normally one does not need to worry about this since it is automatically done --except when running MPI programs across cluster nodes. | Running MPI programs within an Apptainer container across nodes likely will require special configuration. MPI exploits cluster interconnection hardware to communicate amongst nodes much more efficiently. Normally one does not need to worry about this since it is automatically done --except when running MPI programs across cluster nodes. | ||
<b>NOTE:</b> When all MPI processes are running on a single shared-memory node, there is no need to use interconnection hardware and there will be no issues running MPI programs within an Apptainer container when all MPI processes run on a single cluster node, e.g., when the slurm option <code>--nodes=1</code> is used with an <code>sbatch</code> script. Unless one <b>explicitly</b> sets the maximum number of cluster nodes used to <code>1</code>, the scheduler can choose to run an MPI program over multiple nodes. If such will run from within an Apptainer container and has not been set up to properly run, then it is possible it will fail to run. | |||
Text to come. | Text to come. | ||
Line 243: | Line 243: | ||
i.e., <code>-B ./my_data_file.txt:/special/input.dat</code> bind mount maps the file <code>./my_data_file.txt</code> to be the file <code>/special/input.dat</code> inside the container and the <code>wc</code> command now processes that file. This feature can be useful when programs/scripts inside the container have hard-coded paths to files and directories that must be located in certain locations. | i.e., <code>-B ./my_data_file.txt:/special/input.dat</code> bind mount maps the file <code>./my_data_file.txt</code> to be the file <code>/special/input.dat</code> inside the container and the <code>wc</code> command now processes that file. This feature can be useful when programs/scripts inside the container have hard-coded paths to files and directories that must be located in certain locations. | ||
Finally, | Finally, <b>don't mount our CVMFS paths</b> inside your containers as this is fraught with perils and defeats many reasons to use a container. The programs needed to be run inside a container need to be completely inside the container --don't introduce even more programs inside the container that don't need to be inside the container. | ||
==Persistent overlays== | ==Persistent overlays== | ||
Line 252: | Line 252: | ||
==Overview== | ==Overview== | ||
<b>NOTE:</b> Please note and heed the advice concerning building images/overlays given <b>[[#Building_Images.2FOverlays|earlier on this page]]</b> . | |||
Apptainer "images" can be created in the following formats: | Apptainer "images" can be created in the following formats: | ||
Line 284: | Line 284: | ||
==Building a SIF image== | ==Building a SIF image== | ||
<b>NOTE:</b> Please note and heed the advice concerning building images/overlays given <b> [[#Building_Images.2FOverlays|earlier on this page]]</b> . | |||
To build an Apptainer SIF file image from Docker's latest available busybox image, use the <code>apptainer build</code> command: | To build an Apptainer SIF file image from Docker's latest available busybox image, use the <code>apptainer build</code> command: | ||
Line 293: | Line 293: | ||
==Building a sandbox image== | ==Building a sandbox image== | ||
<b>NOTE:</b> Please note and heed the advice concerning building images/overlays given <b>[[#Building_Images.2FOverlays|earlier on this page]]</b> . | |||
In order to build a "sandbox" directory instead of an <code>SIF</code> file instead of providing an <code>SIF</code> file name, instead provide <code>--sandbox DIR_NAME</code> or <code>-s DIR_NAME</code> where <code>DIR_NAME</code> is the name of the to-be-created-directory where you want your "sandbox" image. For example, if the <code>apptainer build</code> command to create an <code>SIF</code> file was: | In order to build a "sandbox" directory instead of an <code>SIF</code> file instead of providing an <code>SIF</code> file name, instead provide <code>--sandbox DIR_NAME</code> or <code>-s DIR_NAME</code> where <code>DIR_NAME</code> is the name of the to-be-created-directory where you want your "sandbox" image. For example, if the <code>apptainer build</code> command to create an <code>SIF</code> file was: | ||
Line 327: | Line 327: | ||
==Creating an Apptainer container from a Dockerfile== | ==Creating an Apptainer container from a Dockerfile== | ||
<b>NOTE: This section requires you to install and use Docker and Apptainer on a system where you have appropriate privileges. These instructions will <i>not</i> work on our compute clusters.</b> | |||
Unfortunately some instructions for packages only provide a <code>Dockerfile</code> without a container image. A <code>Dockerfile</code> contains the instructions necessary for the Docker software to build that container. Our clusters do not have the Docker software installed. That said, if you've access to a system with both Docker and Apptainer installed, and, sufficient access to Docker (e.g., <code>sudo</code> or root access, or, you are in that system's <code>docker</code> group) and if needed Apptainer (e.g., <code>sudo</code> or root access, or, you have <code>--fakeroot</code> access), then you can follow the instructions below to use Docker and then Apptainer to build an Apptainer image on that system. | Unfortunately some instructions for packages only provide a <code>Dockerfile</code> without a container image. A <code>Dockerfile</code> contains the instructions necessary for the Docker software to build that container. Our clusters do not have the Docker software installed. That said, if you've access to a system with both Docker and Apptainer installed, and, sufficient access to Docker (e.g., <code>sudo</code> or root access, or, you are in that system's <code>docker</code> group) and if needed Apptainer (e.g., <code>sudo</code> or root access, or, you have <code>--fakeroot</code> access), then you can follow the instructions below to use Docker and then Apptainer to build an Apptainer image on that system. | ||
<b>NOTE:</b> Using Docker may fail if you are not in the <code>docker</code> group. Similarly, building some containers may fail with Apptainer without appropriate <code>sudo</code>, root, or <code>--fakeroot</code> permissions. It is your responsibility to ensure you've such access on the system you are running the commands below. | |||
If one only has a Dockerfile and wishes to create an Apptainer image, run the following on a computer with Docker and Apptainer installed (where you've sufficient permissions, etc.): | If one only has a Dockerfile and wishes to create an Apptainer image, run the following on a computer with Docker and Apptainer installed (where you've sufficient permissions, etc.): | ||
Line 350: | Line 350: | ||
After this is done, the SIF file is an Apptainer container for the <code>Dockerfile</code>. Transfer the SIF to the approprate cluster(s) in order to use such. | After this is done, the SIF file is an Apptainer container for the <code>Dockerfile</code>. Transfer the SIF to the approprate cluster(s) in order to use such. | ||
<b>NOTE:</b> It is possible that the Dockerfile pulled in more layers which means you will have to manually delete those additional layers by running: | |||
docker images | docker images | ||
Line 356: | Line 356: | ||
followed by runninng <code>docker image rm ID</code> (where ID is the image ID output from the <code>docker images</code> command) in order to free up the disk space associated with those other image layers on the system you are using. | followed by runninng <code>docker image rm ID</code> (where ID is the image ID output from the <code>docker images</code> command) in order to free up the disk space associated with those other image layers on the system you are using. | ||
=Miscellaneous | =Miscellaneous items= | ||
==Cleaning Apptainer's cache directory== | ==Cleaning Apptainer's cache directory== |