Parasail
Note: Starting with StdEnv/2023, the parasail Python extension is bundled in the parasail module. With StdEnv/2020, the module must be loaded before the Python extension can be installed in a virtual environment.
Revision as of 15:49, 11 June 2024
parasail is a SIMD C (C99) library containing implementations of the Smith-Waterman (local), Needleman-Wunsch (global), and various semi-global pairwise sequence alignment algorithms.
Usage
One can find the available versions using
[name@server ~]$ module spider parasail
and load the library using
[name@server ~]$ module load parasail/2.6.2
parasail_aligner Example
When using the parasail_aligner binary, set the number of threads to match the number of cores allocated to your job:
parasail_aligner -t ${SLURM_CPUS_PER_TASK:-1} ...
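When parasail_aligner is launched from a Python driver rather than directly from the shell, the same thread-count logic can be sketched as follows. This is a minimal illustration, not part of parasail: the helper names allocated_cpus and aligner_command are hypothetical, and only the -t option shown above is assumed. Any further arguments are passed through unchanged.

```python
import os


def allocated_cpus(environ=None):
    """Return the number of cores Slurm allocated, defaulting to 1.

    Mirrors the shell expansion ${SLURM_CPUS_PER_TASK:-1}: an unset or
    empty variable falls back to 1.
    """
    env = os.environ if environ is None else environ
    return int(env.get("SLURM_CPUS_PER_TASK") or "1")


def aligner_command(extra_args=(), environ=None):
    """Build the argument list for parasail_aligner (hypothetical helper).

    extra_args holds whatever other options the run needs; they are
    appended as-is and not validated here.
    """
    return ["parasail_aligner", "-t", str(allocated_cpus(environ)), *extra_args]


if __name__ == "__main__":
    # Print the command instead of running it; pass the list to
    # subprocess.run(...) once the remaining arguments are filled in.
    print(aligner_command())
```

Building the command as a list avoids shell quoting issues if it is later handed to subprocess.run.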
Python extension
The module contains bindings for multiple Python versions. To discover which Python versions are compatible, run
[name@server ~]$ module spider parasail/2.6.2
Usage
1. Load the required modules.
[name@server ~]$ module load parasail/2.6.2 python/3.11 scipy-stack/2023b
2. Import parasail 1.3.4.
[name@server ~]$ python -c "import parasail"
If the command displays nothing, the import was successful.
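The same check can be made from inside a script without triggering an ImportError. The sketch below uses only the Python standard library; the helper name can_import is hypothetical and intended for top-level module names such as parasail.

```python
import importlib.util


def can_import(module_name):
    """Return True if a top-level module can be located on sys.path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if can_import("parasail"):
        print("parasail is importable")
    else:
        print("parasail not found: load the module (and Python) first")
```

This only locates the module; it does not load the compiled extension, so the plain import shown above remains the definitive test.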
Example
Run a quick alignment score comparison between Biopython and parasail.
1. Write the Python script (saved here as parasail-sw.py, the name used in the job script):
import parasail
from Bio.Align import PairwiseAligner
A = "ACGT" * 1000
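# parasail: scoring matrix over the alphabet "ACGT" with match=1, mismatch=0
matrix = parasail.matrix_create("ACGT", 1, 0)
# Smith-Waterman (local) alignment with gap open=1 and gap extend=1
parasail_score = parasail.sw(A, A, 1, 1, matrix).score
# biopython: PairwiseAligner with its default settings
bio_score = PairwiseAligner().align(A, A)[0].score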
print('parasail:', parasail_score)
print('biopython:', bio_score)
2. Write the job submission script (submit-parasail.sh):
#!/bin/bash
#SBATCH --account=def-someuser # replace with your PI account
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=3G # increase as needed
#SBATCH --time=1:00:00
module load parasail/2.6.2 python/3.11 scipy-stack/2023b
# Install any other requirements, such as Biopython
virtualenv --no-download $SLURM_TMPDIR/env
source $SLURM_TMPDIR/env/bin/activate
pip install --no-index --upgrade pip
pip install --no-index biopython==1.83
python parasail-sw.py
2.1. With StdEnv/2020, first identify the available wheels:
[name@server ~]$ avail_wheel parasail
name      version    python    arch
--------  ---------  --------  -------
parasail  1.2.4      py2,py3   generic
Then install the desired version in your virtual environment:
#!/bin/bash
#SBATCH --account=def-someuser # replace with your PI account
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=3G # increase as needed
#SBATCH --time=1:00:00
module load StdEnv/2020 gcc parasail/2.5 python/3.10
# Install any other requirements, such as Biopython
virtualenv --no-download $SLURM_TMPDIR/env
source $SLURM_TMPDIR/env/bin/activate
pip install --no-index --upgrade pip
pip install --no-index parasail==1.2.4 biopython==1.83
python parasail-sw.py
3. Then submit the job with
[name@server ~]$ sbatch submit-parasail.sh
4. Once the job has run, the output will be in the Slurm output file:
[name@server ~]$ less slurm-*.out
parasail: 4000
biopython: 4000.0
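To see why both tools report 4000, note that the sequence "ACGT" * 1000 has 4000 characters and each matching position contributes 1 to the score. The following is a minimal pure-Python sketch of the Smith-Waterman recurrence, written only for illustration: it is not parasail's SIMD implementation, the function name sw_score is hypothetical, and it assumes a flat gap cost of 1 per gapped position. For identical sequences the best alignment contains no gaps, so the gap model does not affect the result.

```python
def sw_score(a, b, match=1, mismatch=0, gap=1):
    """Return the best local alignment score of a and b (linear gap cost)."""
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                   # start a new local alignment
                prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                prev[j] - gap,       # gap in b
                curr[j - 1] - gap,   # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    # A short sequence keeps the quadratic-time loop fast.
    seq = "ACGT" * 5
    print(sw_score(seq, seq))
```

The loop visits every pair of positions, so it is only practical for short sequences; for real workloads use parasail itself.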
Available Python packages
Other Python packages that depend on parasail will have that requirement satisfied when the module is loaded:
[name@server ~]$ pip list | grep parasail
parasail 1.3.4
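A rough Python equivalent of the pip list check, using only the standard library, is sketched below. The helper name matching_distributions is hypothetical. It reports whatever distributions the active interpreter can see, so the result depends on which modules and virtual environment are loaded.

```python
from importlib import metadata


def matching_distributions(pattern):
    """Return sorted (name, version) pairs whose name contains pattern."""
    found = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        # Skip entries with missing metadata; compare case-insensitively.
        if name and pattern.lower() in name.lower():
            found.add((name, dist.version))
    return sorted(found)


if __name__ == "__main__":
    for name, version in matching_distributions("parasail"):
        print(name, version)
```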