Nextflow
Nextflow is software for running reproducible scientific workflows. The term Nextflow is used to describe both the domain-specific language (DSL) the pipelines are written in and the software used to interpret those workflows.
Usage
On our systems, Nextflow is provided as a module you can load with module load nextflow.
While you can build your own workflow, you can also rely on the published nf-core pipelines. We will describe here a simple configuration that will let you run nf-core pipelines on our systems and help you to configure Nextflow properly for your own pipelines.
Our example uses the nf-core/smrnaseq pipeline.
Installation
The following procedure is to be run on a login node.
Start by installing a pip package to help with the setup; please note that the nf-core tools can be slow to install.
module purge # make sure that previously loaded packages do not pollute the installation
module load python/3.11
module load postgresql # we will not use PostgreSQL here, but some Python modules list psycopg2 as a dependency and the installation would crash without it
python -m venv nf-core-env
source nf-core-env/bin/activate
python -m pip install nf_core==2.13
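You can check that the tools were installed correctly by printing the version (a quick sanity check; the exact version shown depends on what pip resolved):
nf-core --version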
Set the name of the pipeline to be tested, and load Nextflow and Apptainer. (Apptainer is the successor to the Singularity container utility.) Nextflow integrates well with Apptainer.
export NFCORE_PL=smrnaseq
export PL_VERSION=2.3.1
module load nextflow/23
module load apptainer/1
An important step is to download all the Apptainer images that will be used to run the pipeline at the same time we download the workflow itself. If this isn't done, Nextflow will try to download the images from the compute nodes, just before steps are executed. This would not work on most of our clusters since there is no internet connection on the compute nodes.
Create a folder where images will be stored and set the environment variable NXF_SINGULARITY_CACHEDIR to it. Workflow images tend to be big, so do not store them in your $HOME space because it has a small quota. Instead, store them in /project space.
mkdir /project/<def-group>/NXF_SINGULARITY_CACHEDIR
export NXF_SINGULARITY_CACHEDIR=/project/<def-group>/NXF_SINGULARITY_CACHEDIR
You should share this folder with other members of your group who are planning to use Nextflow with Apptainer, in order to reduce duplication.
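For example, you could open the cache directory to your group with standard POSIX permissions (a minimal sketch; replace <def-group> with your group name, and consider ACLs if you need finer control):
chgrp -R <def-group> $NXF_SINGULARITY_CACHEDIR # hand the directory over to the group
chmod -R g+rwX $NXF_SINGULARITY_CACHEDIR # let group members read and add images
chmod g+s $NXF_SINGULARITY_CACHEDIR # new files inherit the group ownership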
Also, you may add the export command to your ~/.bashrc as a convenience.
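For example, assuming the cache path used above:
echo 'export NXF_SINGULARITY_CACHEDIR=/project/<def-group>/NXF_SINGULARITY_CACHEDIR' >> ~/.bashrc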
The following command downloads the smrnaseq pipeline to your /scratch directory and puts all the images in the cache directory.
cd ~/scratch
nf-core download --singularity-cache-only --container singularity --compress none -r ${PL_VERSION} -p 6 ${NFCORE_PL}
# Alternatively, you can run the download tool in interactive mode
nf-core download
# Here is what your singularity image cache will look like after completion:
$ ls $NXF_SINGULARITY_CACHEDIR/
depot.galaxyproject.org-singularity-bioconvert-1.1.1--pyhdfd78af_0.img
depot.galaxyproject.org-singularity-blat-36--0.img
depot.galaxyproject.org-singularity-bowtie-1.3.1--py310h7b97f60_6.img
depot.galaxyproject.org-singularity-bowtie2-2.4.5--py39hd2f7db1_2.img
depot.galaxyproject.org-singularity-fastp-0.23.4--h5f740d0_0.img
depot.galaxyproject.org-singularity-fastqc-0.12.1--hdfd78af_0.img
depot.galaxyproject.org-singularity-fastx_toolkit-0.0.14--hdbdd923_11.img
depot.galaxyproject.org-singularity-mirdeep2-2.0.1.3--hdfd78af_1.img
depot.galaxyproject.org-singularity-mirtrace-1.0.1--hdfd78af_1.img
depot.galaxyproject.org-singularity-mulled-v2-0c13ef770dd7cc5c76c2ce23ba6669234cf03385-63be019f50581cc5dfe4fc0f73ae50f2d4d661f7-0.img
depot.galaxyproject.org-singularity-mulled-v2-419bd7f10b2b902489ac63bbaafc7db76f8e0ae1-f5ff7de321749bc7ae12f7e79a4b581497f4c8ce-0.img
depot.galaxyproject.org-singularity-mulled-v2-ffbf83a6b0ab6ec567a336cf349b80637135bca3-40128b496751b037e2bd85f6789e83d4ff8a4837-0.img
depot.galaxyproject.org-singularity-multiqc-1.21--pyhdfd78af_0.img
depot.galaxyproject.org-singularity-pigz-2.3.4.img
depot.galaxyproject.org-singularity-r-data.table-1.12.2.img
depot.galaxyproject.org-singularity-samtools-1.19.2--h50ea8bc_0.img
depot.galaxyproject.org-singularity-seqcluster-1.2.9--pyh5e36f6f_0.img
depot.galaxyproject.org-singularity-seqkit-2.6.1--h9ee0642_0.img
depot.galaxyproject.org-singularity-ubuntu-20.04.img
depot.galaxyproject.org-singularity-umicollapse-1.0.0--hdfd78af_1.img
depot.galaxyproject.org-singularity-umi_tools-1.1.5--py39hf95cd2a_0.img
quay.io-singularity-bioconvert-1.1.1--pyhdfd78af_0.img
quay.io-singularity-blat-36--0.img
quay.io-singularity-bowtie-1.3.1--py310h7b97f60_6.img
quay.io-singularity-bowtie2-2.4.5--py39hd2f7db1_2.img
quay.io-singularity-fastp-0.23.4--h5f740d0_0.img
quay.io-singularity-fastqc-0.12.1--hdfd78af_0.img
quay.io-singularity-fastx_toolkit-0.0.14--hdbdd923_11.img
quay.io-singularity-mirdeep2-2.0.1.3--hdfd78af_1.img
quay.io-singularity-mirtrace-1.0.1--hdfd78af_1.img
quay.io-singularity-mulled-v2-0c13ef770dd7cc5c76c2ce23ba6669234cf03385-63be019f50581cc5dfe4fc0f73ae50f2d4d661f7-0.img
quay.io-singularity-mulled-v2-419bd7f10b2b902489ac63bbaafc7db76f8e0ae1-f5ff7de321749bc7ae12f7e79a4b581497f4c8ce-0.img
quay.io-singularity-mulled-v2-ffbf83a6b0ab6ec567a336cf349b80637135bca3-40128b496751b037e2bd85f6789e83d4ff8a4837-0.img
quay.io-singularity-multiqc-1.21--pyhdfd78af_0.img
quay.io-singularity-pigz-2.3.4.img
quay.io-singularity-r-data.table-1.12.2.img
quay.io-singularity-samtools-1.19.2--h50ea8bc_0.img
quay.io-singularity-seqcluster-1.2.9--pyh5e36f6f_0.img
quay.io-singularity-seqkit-2.6.1--h9ee0642_0.img
quay.io-singularity-ubuntu-20.04.img
quay.io-singularity-umicollapse-1.0.0--hdfd78af_1.img
quay.io-singularity-umi_tools-1.1.5--py39hf95cd2a_0.img
singularity-bioconvert-1.1.1--pyhdfd78af_0.img
singularity-blat-36--0.img
singularity-bowtie-1.3.1--py310h7b97f60_6.img
singularity-bowtie2-2.4.5--py39hd2f7db1_2.img
singularity-fastp-0.23.4--h5f740d0_0.img
singularity-fastqc-0.12.1--hdfd78af_0.img
singularity-fastx_toolkit-0.0.14--hdbdd923_11.img
singularity-mirdeep2-2.0.1.3--hdfd78af_1.img
singularity-mirtrace-1.0.1--hdfd78af_1.img
singularity-mulled-v2-0c13ef770dd7cc5c76c2ce23ba6669234cf03385-63be019f50581cc5dfe4fc0f73ae50f2d4d661f7-0.img
singularity-mulled-v2-419bd7f10b2b902489ac63bbaafc7db76f8e0ae1-f5ff7de321749bc7ae12f7e79a4b581497f4c8ce-0.img
singularity-mulled-v2-ffbf83a6b0ab6ec567a336cf349b80637135bca3-40128b496751b037e2bd85f6789e83d4ff8a4837-0.img
singularity-multiqc-1.21--pyhdfd78af_0.img
singularity-pigz-2.3.4.img
singularity-r-data.table-1.12.2.img
singularity-samtools-1.19.2--h50ea8bc_0.img
singularity-seqcluster-1.2.9--pyh5e36f6f_0.img
singularity-seqkit-2.6.1--h9ee0642_0.img
singularity-ubuntu-20.04.img
singularity-umicollapse-1.0.0--hdfd78af_1.img
singularity-umi_tools-1.1.5--py39hf95cd2a_0.img
This workflow downloads 18 containers for a total of about 4 GB, and creates an nf-core-${NFCORE_PL}-${PL_VERSION} folder with the workflow and config subfolders. The config subfolder includes the institutional configuration, while the workflow itself is in the workflow subfolder.
This is what a typical nf-core pipeline looks like:
$ ls nf-core-${NFCORE_PL}_${PL_VERSION}/2_3_1
assets CITATIONS.md docs modules nextflow_schema.json subworkflows
bin CODE_OF_CONDUCT.md LICENSE modules.json pyproject.toml tower.yml
CHANGELOG.md conf main.nf nextflow.config README.md workflows
When the pipeline is launched, Nextflow will look at the nextflow.config file and also at the ~/.nextflow/config file (if it exists) to control how to run the workflow. The nf-core pipelines all have a default configuration, a test configuration, and container configurations (singularity, podman, etc.). You will need to provide a custom configuration for the cluster you are running on (Narval, Béluga, Cedar or Graham); a simple configuration is provided in the next section. Nextflow pipelines could also run on Niagara if they were designed with that specific cluster in mind, but we generally discourage running nf-core or any other generic Nextflow pipeline on Niagara.
A configuration for our clusters
You can use the following config; it changes the default values for nf-core processes and provides the correct information for the Béluga and Narval clusters. The cluster-specific settings are saved in profile blocks that are selected at runtime.
params {
config_profile_description = 'Alliance HPC config'
config_profile_contact = 'support@alliancecan.ca'
config_profile_url = 'docs.alliancecan.ca/mediawiki/index.php?title=Nextflow'
}
singularity {
enabled = true
autoMounts = true
}
apptainer {
autoMounts = true
}
process {
executor = 'slurm'
clusterOptions = '--account=<my-account>'
maxRetries = 1
errorStrategy = { task.exitStatus in [125,139] ? 'retry' : 'finish' }
memory = { check_max( 4.GB * task.attempt, 'memory' ) }
cpus = 1
time = '3h'
}
executor {
pollInterval = '60 sec'
submitRateLimit = '60/1min'
queueSize = 100
}
profiles {
  beluga {
    params {
      max_memory = '186G'
      max_cpus = 40
      max_time = '168h'
    }
  }
  narval {
    params {
      max_memory = '249G'
      max_cpus = 64
      max_time = '168h'
    }
  }
}
Replace <my-account> with your own account, which looks like def-pname.
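For Nextflow to find these profiles at runtime, save the configuration in a file it reads by default, such as ~/.nextflow/config, or keep it in a separate file and load it explicitly with the -c option (the file name alliance.config below is only an example):
mkdir -p ~/.nextflow
cp alliance.config ~/.nextflow/config # read automatically for every run
# or load it explicitly for a single run:
# nextflow run <pipeline> -c alliance.config ...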
The singularity.autoMounts = true setting ensures that all the cluster filesystems (/project, /scratch, /home and /localscratch) will be properly mounted inside the Singularity container.
This configuration ensures that there are no more than 100 jobs in the Slurm queue and that only 60 jobs are submitted per minute. It indicates that Béluga machines have 40 cores and 186G of RAM with a maximum walltime of one week (168 hours).
The config is linked to the system you are running on, but it is also related to the pipeline itself. For example, here cpus = 1 is the default value, but steps in the pipeline can request more than that. This can get quite complicated: labels in the nf-core-smrnaseq_2.3.1/2_3_1/conf/base.config file are used internally by the pipeline to identify steps with a non-default configuration. We do not cover this more advanced topic here, but note that tweaking these labels could make a big difference in the queuing and execution time of your pipeline.
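As an illustration only (the label names follow the usual nf-core convention of process_low, process_medium and process_high; check the base.config of your pipeline for the labels it actually defines), a process selector in your custom config can override the resources of every step carrying a given label:
process {
  withLabel: process_high {
    cpus = 12
    memory = 72.GB
    time = '12h'
  }
}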
With the process.errorStrategy and process.memory lines, this configuration also defines a default restart behavior that automatically adds 4 GB of RAM on failed steps with exit code 125 (out of memory) or 139 (killed because the process used more memory than allowed by its cgroup): a first attempt runs with 4 GB, and the retry runs with 8 GB.
Running the pipeline
Use the two profiles provided by nf-core (test and singularity) and the profile we have just created for Béluga. Note that Nextflow is mainly written in Java, which tends to use a lot of virtual memory. On the Narval cluster that won't be a problem, but on the Béluga login node you will need to raise the virtual memory limit to run most workflows; to set the limit to 40 GB, use the ulimit -v 40000000 command. We also use a terminal multiplexer, so the Nextflow pipeline will keep running if you are disconnected and you will be able to reconnect to the controller process. Note that running Nextflow on login nodes is easy on Béluga and Narval, but harder on Graham and Cedar since the login node virtual memory limit cannot be changed on those clusters; there, we recommend launching Nextflow from a compute node, where virtual memory is never limited.
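On Béluga, the launch sequence could therefore look like this (a sketch; the tmux session name is arbitrary), followed by the run command below:
tmux new -s nextflow # start a session you can detach from and reattach to later
ulimit -v 40000000 # raise the virtual memory limit for the Java-based controller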
nextflow run nf-core-${NFCORE_PL}-${PL_VERSION}/workflow -profile test,singularity,beluga --outdir ${NFCORE_PL}_OUTPUT
Be careful if you have an AWS configuration in your ~/.aws directory, as Nextflow might complain that it can't download the pipeline test dataset with your default credentials.
You have now started the Nextflow sub-scheduler on the login node; this process submits jobs to Slurm as they become ready to be processed.
You can watch the progression of the pipeline in this session. You can also open a new session on the cluster, or detach from the tmux session, to have a look at the jobs in the Slurm queue with squeue -u $USER or sq.
Known issues
Some users have reported getting a SIGBUS error from the Nextflow main process.
We suspect this is connected with these Nextflow issues:
* https://github.com/nextflow-io/nextflow/issues/842
* https://github.com/nextflow-io/nextflow/issues/2774
Setting the environment variable NXF_OPTS="-Dleveldb.mmap=false" when executing nextflow is reported to solve the problem.
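For example, the variable can be set inline for a single run:
NXF_OPTS="-Dleveldb.mmap=false" nextflow run nf-core-${NFCORE_PL}-${PL_VERSION}/workflow -profile test,singularity,beluga --outdir ${NFCORE_PL}_OUTPUT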