MAFFT

From Alliance Doc
Jump to navigation Jump to search
Other languages:

MAFFT is a multiple sequence alignment program for unix-like operating systems. It offers a range of multiple alignment methods, L-INS-i (accurate; for alignment of <∼200 sequences), FFT-NS-2 (fast; for alignment of <∼30,000 sequences), etc.

Single node

MAFFT can benefit from multiple cores on a single node. For more information: https://mafft.cbrc.jp/alignment/software/multithreading.html

Note: The MAFFT_TMPDIR is set to $SLURM_TMPDIR/maffttmp when you load the module.


File : mafft_submit.sh

#!/bin/bash

#SBATCH --time=24:00:00
#SBATCH --nodes=1
#SBATCH --cpus-per-task=32
#SBATCH --mem=0

module load gcc/9.3.0 mafft

mafft --globalpair --thread $SLURM_CPUS_PER_TASK input > output


Multiple nodes (MPI)

MAFFT can use MPI to align a large number of sequences: https://mafft.cbrc.jp/alignment/software/mpi.html

Note: MAFFT_TMPDIR is set to $SCRATCH/maffttmp when you load the module. If you change this temporary directory, it must be shared by all hosts.


File : mafft_mpi_submit.sh

#!/bin/bash

#SBATCH --time=04:00:00
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --cpus-per-task=1
#SBATCH --mem=12G

module load gcc/9.3.0 mafft-mpi

srun mafft --mpi --large --globalpair --thread $SLURM_CPUS_PER_TASK input > output