Parasail

From Alliance Doc
Latest revision as of 14:07, 28 June 2024

parasail is a SIMD C (C99) library containing implementations of the Smith-Waterman (local), Needleman-Wunsch (global), and various semi-global pairwise sequence alignment algorithms.

Note: From StdEnv/2023 onwards, the parasail-python extension is bundled in the parasail module. With StdEnv/2020, however, the parasail module must be loaded before the Python extension can be installed in a virtual environment.


Usage

Find the available versions using

[name@server ~]$ module spider parasail

and load the library using

[name@server ~]$ module load parasail/2.6.2

parasail_aligner Example

When using the parasail_aligner binary, it is important to set the number of threads to match the number of cores allocated to the job. Set it with

parasail_aligner -t ${SLURM_CPUS_PER_TASK:-1} ...
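
When parasail_aligner is launched from a Python driver script rather than directly from the shell, the same fallback can be reproduced in Python. The following is a minimal sketch: the helper name `build_aligner_command` and its `extra_args` parameter are hypothetical, and only the `-t` option shown above is assumed. Any input or output options are left for you to supply.

```python
import os
import shlex


def build_aligner_command(extra_args=()):
    """Return the parasail_aligner argument list with -t taken from Slurm."""
    # Slurm sets SLURM_CPUS_PER_TASK when --cpus-per-task is requested.
    # Fall back to 1 thread when it is unset or empty, like ${SLURM_CPUS_PER_TASK:-1}.
    threads = int(os.environ.get("SLURM_CPUS_PER_TASK") or 1)
    return ["parasail_aligner", "-t", str(threads), *extra_args]


if __name__ == "__main__":
    # extra_args is left empty here; add your own options before running.
    print(shlex.join(build_aligner_command()))
```

The returned list can be passed to `subprocess.run` once the remaining arguments have been added.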

Python extension

The module contains bindings for multiple Python versions. To find out which Python versions are compatible, run

[name@server ~]$ module spider parasail/1.3.4

Usage

1. Load the required modules.

[name@server ~]$ module load parasail/2.6.2 python/3.11 scipy-stack/2023b

2. Import parasail 1.3.4.

[name@server ~]$ python -c "import parasail"

If the command displays nothing, the import was successful.
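
For a more explicit check than a silent import, a short script can report whether the package is importable and which version is installed. This is a sketch using only the Python standard library. The helper name `describe_package` is hypothetical, and it assumes the package exposes standard installation metadata; otherwise it reports an unknown version.

```python
from importlib import metadata, util


def describe_package(name):
    """Report whether a package can be imported and which version is installed."""
    if util.find_spec(name) is None:
        return f"{name}: not importable"
    try:
        version = metadata.version(name)
    except metadata.PackageNotFoundError:
        # Importable, but no installation metadata was found.
        version = "unknown version"
    return f"{name}: importable ({version})"


if __name__ == "__main__":
    print(describe_package("parasail"))
```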

Example

Run a quick local alignment score comparison between Biopython and parasail.

1. Write the Python script:

File : parasail-sw.py

import parasail
from Bio.Align import PairwiseAligner

A = "ACGT" * 1000

# parasail
matrix = parasail.matrix_create("ACGT", 1, 0)
parasail_score = parasail.sw(A, A, 1, 1, matrix).score

# biopython
bio_score = PairwiseAligner().align(A, A)[0].score

print('parasail:', parasail_score)
print('biopython:', bio_score)
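
To see what score to expect from the script above, the Smith-Waterman recurrence can be sketched in plain Python. This is an illustrative reference only, not how parasail is implemented (parasail uses SIMD). It assumes that a gap open penalty of 1 and a gap extension penalty of 1 amount to a linear cost of 1 per gap position, with match 1 and mismatch 0 as in the matrix above. The function name `sw_score` is hypothetical.

```python
def sw_score(a, b, match=1, mismatch=0, gap=1):
    """Best local alignment score with a linear gap penalty, in O(len(a) * len(b))."""
    prev = [0] * (len(b) + 1)  # previous row of the dynamic programming table
    best = 0
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ch_a == ch_b else mismatch)
            up = prev[j] - gap        # gap in b
            left = curr[j - 1] - gap  # gap in a
            cell = max(0, diag, up, left)  # local alignment never drops below 0
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


if __name__ == "__main__":
    seq = "ACGT" * 10
    print(sw_score(seq, seq))  # identical sequences score one point per base: 40
```

Aligning a sequence against itself gives one point per position, which is why the example with "ACGT" * 1000 is expected to score 4000.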


2. Write the job submission script:

File : submit-parasail.sh

#!/bin/bash
#SBATCH --account=def-someuser  # replace with your PI account
#SBATCH --cpus-per-task=1 
#SBATCH --mem-per-cpu=3G      # increase as needed
#SBATCH --time=1:00:00

module load parasail/2.6.2 python/3.11 scipy-stack/2023b

# Install any other requirements, such as Biopython
virtualenv --no-download $SLURM_TMPDIR/env
source $SLURM_TMPDIR/env/bin/activate
pip install --no-index --upgrade pip
pip install --no-index biopython==1.83

python parasail-sw.py


2.1. With StdEnv/2020, first identify the available wheels:

[name@server ~]$ avail_wheel parasail
name      version    python    arch
--------  ---------  --------  -------
parasail  1.2.4      py2,py3   generic

Install the desired version in your virtual environment:

File : submit-parasail.sh

#!/bin/bash
#SBATCH --account=def-someuser  # replace with your PI account
#SBATCH --cpus-per-task=1 
#SBATCH --mem-per-cpu=3G      # increase as needed
#SBATCH --time=1:00:00

module load StdEnv/2020 gcc parasail/2.5 python/3.10

# Install any other requirements, such as Biopython:
virtualenv --no-download $SLURM_TMPDIR/env
source $SLURM_TMPDIR/env/bin/activate
pip install --no-index --upgrade pip
pip install --no-index parasail==1.2.4 biopython==1.83

python parasail-sw.py


3. Submit the job with

[name@server ~]$ sbatch submit-parasail.sh

4. When the job has run, the output will be in the Slurm output file:

[name@server ~]$ less slurm-*.out
parasail: 4000
biopython: 4000.0
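
The two scores are printed as an integer and a float, so a plain text comparison of the lines would not match. One possible way to check them programmatically is sketched below. The helper name `parse_scores` is hypothetical, and the sample text is the output shown above.

```python
import math


def parse_scores(text):
    """Map each 'name: value' line to a float, skipping lines that do not fit."""
    scores = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # no colon on this line
        try:
            scores[name.strip()] = float(value)
        except ValueError:
            continue  # the part after the colon is not a number
    return scores


if __name__ == "__main__":
    sample = "parasail: 4000\nbiopython: 4000.0\n"
    scores = parse_scores(sample)
    print(math.isclose(scores["parasail"], scores["biopython"]))
```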

Available Python packages

Other Python packages that depend on parasail will have their requirement satisfied by loading the parasail module:

[name@server ~]$ pip list | grep parasail
parasail                           1.3.4