MAFFT

From Alliance Doc
Jump to navigation Jump to search
This site replaces the former Compute Canada documentation site, and is now being managed by the Digital Research Alliance of Canada.

Ce site remplace l'ancien site de documentation de Calcul Canada et est maintenant géré par l'Alliance de recherche numérique du Canada.

Other languages:

MAFFT is a multiple sequence alignment program for unix-like operating systems. It offers a range of multiple alignment methods, L-INS-i (accurate; for alignment of <∼200 sequences), FFT-NS-2 (fast; for alignment of <∼30,000 sequences), etc.

Single node[edit]

MAFFT can benefit from multiple cores on a single node. For more information: https://mafft.cbrc.jp/alignment/software/multithreading.html

Note: The MAFFT_TMPDIR is set to $SLURM_TMPDIR/maffttmp when you load the module.


File : mafft_submit.sh

#!/bin/bash

#SBATCH --time=24:00:00
#SBATCH --nodes=1
#SBATCH --cpus-per-task=32
#SBATCH --mem=0

module load gcc/9.3.0 mafft

mafft --globalpair --thread $SLURM_CPUS_PER_TASK input > output


Multiple nodes (MPI)[edit]

MAFFT can use MPI to align a large number of sequences: https://mafft.cbrc.jp/alignment/software/mpi.html

Note: MAFFT_TMPDIR is set to $SCRATCH/maffttmp when you load the module. If you change this temporary directory, it must be shared by all hosts.


File : mafft_mpi_submit.sh

#!/bin/bash

#SBATCH --time=04:00:00
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --cpus-per-task=1
#SBATCH --mem=12G

module load gcc/9.3.0 mafft-mpi

srun mafft --mpi --large --globalpair --thread $SLURM_CPUS_PER_TASK input > output