Apptainer: Difference between revisions

Line 46: Line 46:
* Install Linux, Apptainer, and <code>sudo</code> in a virtual machine on a system you control so that you have <code>sudo</code> access within that virtual machine. Build your image(s) on that machine and upload them in order to use them on Alliance systems.
* If appropriate, [[Technical Support|submit a ticket]] asking whether Alliance staff can help build the image(s), etc. that require <code>sudo</code>. (This may or may not be possible, but feel free to ask in a ticket if what you wish to achieve is beyond your means. We may also respond with other ways to achieve your goal, which may or may not involve Apptainer.)
NOTE: Apptainer version 1.1, which as of August 2022 has release candidate status only, may allow users to easily use the <code>--fakeroot</code> option with various Apptainer commands. This should make it possible to do various things without <code>sudo</code>. If so, this web page will be updated accordingly after Apptainer version 1.1 has been released.


====Important Command Line Options====
Line 67: Line 65:
|}
|}


There is another important option, <code>-W</code> or <code>--workdir</code>, one should consider using. On Alliance clusters and on most Linux systems, <code>/tmp</code> and similar filesystems use RAM, not disk space. Since jobs typically run on our clusters with limited RAM amounts, this can easily result in jobs getting killed because they consume too much RAM relative to what was requested for the job. A suitable workaround is to tell Apptainer to use an actual disk location for its "workdir". This is done by passing the <code>-W</code> option followed by a path to a disk location where Apptainer can read and write temporary files. For example, suppose one wanted to run a command <code>myprogram</code> using an Apptainer container image called <code>myimage.sif</code> with its "workdir" set to <code>$HOME/aworkdir</code> in the filesystem:
Another important option, which one may need in order to use Apptainer successfully, is <code>-W</code> or <code>--workdir</code>. On Alliance clusters and on most Linux systems, <code>/tmp</code> and similar filesystems use RAM, not disk space. Since jobs typically run on our clusters with limited RAM amounts, this can result in jobs getting killed because they consume too much RAM relative to what was requested for the job. A suitable workaround is to tell Apptainer to use a real disk location for its "workdir". This is done by passing the <code>-W</code> option followed by a path to a disk location where Apptainer can read and write temporary files. For example, suppose one wanted to run a command <code>myprogram</code> using an Apptainer container image called <code>myimage.sif</code> with its "workdir" set to <code>/path/to/a/workdir</code> in the filesystem:


<source lang="console">$ mkdir -p $HOME/aworkdir
$ apptainer run -C -B /home -W $HOME/aworkdir myimage.sif myprogram</source>
<source lang="console">$ mkdir -p /path/to/a/workdir
$ apptainer run -C -B /home -W /path/to/a/workdir myimage.sif myprogram</source>


where:
* The workdir directory can be removed if there are no live containers using it.
* When using Apptainer in an <code>salloc</code> or an <code>sbatch</code> job, use <code>${SLURM_TMPDIR}</code> for the "workdir" location, e.g., <code>-W ${SLURM_TMPDIR}</code>.
* When using Apptainer in an <code>salloc</code> or <code>sbatch</code> job, or when using [JupyterHub] on our clusters, use <code>${SLURM_TMPDIR}</code> for the "workdir" location, e.g., <code>-W ${SLURM_TMPDIR}</code>.
** ASIDE: One should '''not''' be running programs (including using Apptainer) on a login node: use an interactive <code>salloc</code> job.
** ASIDE: One should '''not''' be running programs (including Apptainer) on a login node: use an interactive <code>salloc</code> job.
* When using bind mounts, see the [[#Bind_Mounts|section on bind mounts]] below since not all Alliance clusters are the same concerning the exact bind mounts that are needed to access <code>/home</code>, <code>/project</code>, and <code>/scratch</code>.
* When using bind mounts, see the [[#Bind_Mounts|section on bind mounts]] below, since Alliance clusters differ in the exact bind mounts needed to access <code>/home</code>, <code>/project</code>, and <code>/scratch</code>.
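
The "workdir" advice above can be sketched as a small shell helper. This is only an illustration under stated assumptions: <code>choose_workdir</code> is a hypothetical function name (not an Apptainer or Slurm command), the fallback path <code>$HOME/aworkdir</code> and the names <code>myimage.sif</code> and <code>myprogram</code> are taken from the example above, and the final line prints the command rather than running it. The <code>-B /home</code> bind mount may need adjusting per cluster.

```shell
#!/bin/bash
# Sketch only: choose_workdir is a hypothetical helper, not part of Apptainer or Slurm.

# Print a disk-backed directory suitable for "apptainer -W".
# Inside a Slurm job SLURM_TMPDIR is set, so prefer it; otherwise fall back
# to the directory given as $1 (default: $HOME/aworkdir).
choose_workdir() {
    local fallback="${1:-${HOME}/aworkdir}"
    if [ -n "${SLURM_TMPDIR:-}" ]; then
        printf '%s\n' "${SLURM_TMPDIR}"
    else
        printf '%s\n' "${fallback}"
    fi
}

workdir="$(choose_workdir)"
# Dry run: print the command instead of executing it.
# The fallback directory must exist (mkdir -p) before a real run.
echo "apptainer run -C -B /home -W ${workdir} myimage.sif myprogram"
```

To use it for real, one would create the fallback directory with <code>mkdir -p</code> when outside a job and replace the <code>echo</code> with the actual <code>apptainer</code> invocation.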


====Using GPUs====