Biorepo containers
This is not a complete article: This is a draft, a work in progress that is intended to be published into an article, which may or may not be ready for inclusion in the main wiki. It should not necessarily be considered factual or authoritative.
Introduction
Distributing software together with its often complex dependencies in containers has recently become commonplace, thanks to the rising popularity of container management systems and runtime environments such as Docker and Singularity (now called Apptainer). Users are free to bring their software to Alliance systems in containers of their own. While many users will simply build containers and store them in their own project spaces, groups with many users, those who wish to run the same container on many systems, or those who need to preserve a complex workflow for some period of time may choose to deposit their container image into the CVMFS repository described below. For container authors and contributors, this article outlines the requirements containers need to meet to be hosted in the repository. For users of these containers, it lists the available containers and gives some information on their use.
Container repository
The Research Software National Team (RSNT) and the Bioinformatics National Team (BNT) operate a container repository that stores Apptainer images in CVMFS, making these containers available on all Alliance systems. The BNT maintains the portion of that repository of interest to bioinformatics-focused users. This not only reduces the effort needed to run workflows across our infrastructure, but also allows the hard work of creating workflows to be leveraged by others in the community while enabling a high degree of reproducibility. Note that these container images will be available to all users on Alliance systems; container authors must take this into account when considering this service for workflows they wish to keep private.
We support Apptainer (formerly Singularity) containers on our systems. Some examples of supported use cases are:
- Containers with mature workflows, especially for clinical studies, that are needed for long-term projects (3-5 years). These types of containers can provide reproducible workflows and tools commonly used in this type of research.
- Containers that require Conda and cannot easily be ported to use pip and our modules. This type of container can be used when the desired tools must be, or are much more easily, installed through Conda or a similar environment manager.
- Containers with most of the tools, packages or libraries installed for a certain workflow, such as scRNA-seq. This type of container has the commonly used tools as well as the prerequisite libraries and system files which make it easier to update the version of the desired tools or install new ones.
Additionally, all applications in a container must be open source and governed by a license that allows us to distribute the software in this manner. Only the sandbox format of container images (i.e. an unpacked directory) should be published on CVMFS, not .sif files.
Contact our Technical support for more details or any follow-up questions.
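To illustrate the sandbox requirement, here is a minimal Python sketch that assembles the `apptainer build --sandbox` command, which produces an unpacked directory rather than a .sif file. The helper names and the file names `recipe.def` and `my_image` are hypothetical and chosen only for this example; the actual publication process on CVMFS is handled by the maintaining teams and is not shown here.

```python
import subprocess


def sandbox_build_command(source, target_dir):
    """Return the argv list for building a sandbox (unpacked directory) image.

    `source` may be a definition file or an existing image (e.g. a .sif file);
    `target_dir` is the directory that will hold the unpacked container.
    """
    return ["apptainer", "build", "--sandbox", str(target_dir), str(source)]


def build_sandbox(source, target_dir):
    """Run the build; raises CalledProcessError if apptainer reports failure."""
    subprocess.run(sandbox_build_command(source, target_dir), check=True)


# Hypothetical file names, for illustration only; this just prints the command.
print(" ".join(sandbox_build_command("recipe.def", "my_image")))
```

Calling `build_sandbox("recipe.def", "my_image")` would run the build itself, provided Apptainer is installed and the account has the privileges the recipe needs.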
If you have a container image you would like to incorporate into this system, please provide us with the following information:
- An Apptainer or Singularity recipe file along with a brief description of the container to be made available on the wiki page. Ensure that the recipe is reproducible: pin the version of software being installed as much as possible.
- Instructions and/or examples of usage if your container implements a complex workflow rather than simply making a piece of software available.
- The audience for this image: Containers hosted here are intended to have a reach beyond their originating contributor. How many research groups/PIs are involved, and how many users would be using the image?
- The justification for this image: Our primary way of deploying software is through modules and Python wheels. Why can this application not be provided as a software module or Python wheel?
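As a rough aid for the version-pinning request above, the following Python sketch scans the text of a recipe for `pip install` commands and reports packages that lack an exact `==` pin. It is a simple heuristic written for this example, not a tool used by the repository maintainers: the function name is hypothetical, it only understands pip, and it does not handle options that take a separate argument (such as `-r requirements.txt`) or commands continued across lines.

```python
def unpinned_pip_packages(recipe_text):
    """Return pip package specs in a recipe that have no exact '==' version pin."""
    unpinned = []
    for raw_line in recipe_text.splitlines():
        # Drop trailing comments, then split the line on whitespace.
        tokens = raw_line.split("#", 1)[0].split()
        for i in range(len(tokens) - 1):
            if tokens[i] in ("pip", "pip3") and tokens[i + 1] == "install":
                for tok in tokens[i + 2:]:
                    if tok in ("&&", ";", "\\"):
                        break  # end of this pip command
                    if tok.startswith("-"):
                        continue  # skip options such as --no-cache-dir
                    if "==" not in tok:
                        unpinned.append(tok)
    return unpinned


# Illustrative recipe text (an assumption for this example, not a real submission).
SAMPLE_RECIPE = """\
Bootstrap: docker
From: python:3.11-slim

%post
    pip install numpy==1.26.4 pandas
    pip3 install --no-cache-dir scipy==1.11.4 && echo done
"""

print(unpinned_pip_packages(SAMPLE_RECIPE))
```

In the sample recipe, only `pandas` is installed without a pinned version, so it is the one package reported. The same idea could be extended to Conda or system package managers, each of which uses its own pinning syntax.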
Note that there may be additional metadata and deployment processes needed to accommodate security requirements as set out by the Alliance security team. These requirements and any other security remediations will be enumerated in the governing policy document for the container repository which will be linked here once published.
Image Name | Description | Contributed By |
timsconvert_v1.0.0 | Convert raw Bruker timsTOF Pro and fleX MS data formats to open source data formats | Jean-Francois Lucier |
ncov-tools | QC pipeline on coronavirus sequencing results | Jose Hector Galvez |
genpipes | Bioinformatics analysis pipeline | Jose Hector Galvez |