BUSCO
Revision as of 16:08, 15 March 2021
BUSCO stands for "Benchmarking sets of Universal Single-Copy Orthologs".
It is an application for assessing genome assembly and annotation completeness.
For more information see the user manual.
Available versions
Version 3.0.2 of BUSCO is installed as a module on CVMFS and accessible on all clusters. See below how to use it.
To use a newer version, you can install it in your own account using a virtual environment, as follows:
[name@server ~]$ module load python/3.7.4
[name@server ~]$ git clone https://gitlab.com/ezlab/busco.git
[name@server ~]$ virtualenv /home/$USER/busco_env
[name@server ~]$ source /home/$USER/busco_env/bin/activate
(busco_env) [name@server ~]$ pip install Biopython
(busco_env) [name@server ~]$ cd ~/busco
(busco_env) [name@server ~/busco]$ python setup.py install
(busco_env) [name@server ~/busco]$ cp -r scripts test_data /home/$USER/busco_env/
Then add "/home/$USER/busco_env/scripts" to your PATH.
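One way to extend the search path for the current shell session is sketched below. It assumes the virtual environment was created at /home/$USER/busco_env as in the commands above; to make the change persistent you would typically put the same export line in your shell startup file.

```shell
# Prepend the BUSCO scripts directory to PATH for the current session.
# Assumes the virtual environment lives in /home/$USER/busco_env.
export PATH="/home/${USER}/busco_env/scripts:${PATH}"

# Show the first PATH entry to confirm the directory is now searched first.
printf '%s\n' "${PATH%%:*}"
```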
Using BUSCO from CVMFS
1. Load the necessary modules:
[name@server ~]$ module load StdEnv/2018.3 gcc/7.3.0 openmpi/3.1.4 busco/3.0.2 r/4.0.2
This will also load modules for augustus, blast+, hmmer and some other software packages that BUSCO relies upon.
2. Copy the configuration file:
[name@server ~]$ cp -v $EBROOTBUSCO/config/config.ini.default $HOME/busco_config.ini
or
[name@server ~]$ wget -O $HOME/busco_config.ini https://gitlab.com/ezlab/busco/raw/master/config/config.ini.default
3. Edit the configuration file. The locations of external tools are all specified in the last section, which is shown below:
[tblastn]
# path to tblastn
path = /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx512/Compiler/gcc7.3/blast+/2.7.1/bin/
[makeblastdb]
# path to makeblastdb
path = /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx512/Compiler/gcc7.3/blast+/2.7.1/bin/
[augustus]
# path to augustus
path = /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx512/Compiler/gcc7.3/augustus/3.3/bin/
[etraining]
# path to augustus etraining
path = /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx512/Compiler/gcc7.3/augustus/3.3/bin/
# path to augustus perl scripts, redeclare it for each new script
[gff2gbSmallDNA.pl]
path = /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx512/Compiler/gcc7.3/augustus/3.3/scripts/
[new_species.pl]
path = /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx512/Compiler/gcc7.3/augustus/3.3/scripts/
[optimize_augustus.pl]
path = /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx512/Compiler/gcc7.3/augustus/3.3/scripts/
[hmmsearch]
# path to HMMsearch executable
path = /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx512/Compiler/gcc7.3/hmmer/3.1b2/bin/
[Rscript]
# path to Rscript, if you wish to use the plot tool
path = /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx512/Compiler/gcc7.3/r/4.0.2/bin/
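If the paths above do not match what is installed on the cluster you are using, the following sketch may help you find the right values. It assumes the modules from step 1 are already loaded, and it simply reports the directory in which each executable is found on your PATH. The helper name tool_dir is made up for this example. The Augustus Perl scripts are not covered, since they may not be on the PATH.

```shell
# Print the directory that holds a given executable, as found on the PATH.
# Returns a non-zero status if the executable cannot be found.
tool_dir() {
    tool_path=$(command -v "$1") || return 1
    dirname "$tool_path"
}

# Report the directory for each external tool named in the configuration file.
for tool in tblastn makeblastdb augustus etraining hmmsearch Rscript; do
    if tool_bin=$(tool_dir "$tool"); then
        printf '%s: %s/\n' "$tool" "$tool_bin"
    else
        printf '%s: not found on PATH\n' "$tool"
    fi
done
```

Each reported directory can then be copied into the matching "path =" line of $HOME/busco_config.ini.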
4. Copy the Augustus config directory to a writable location:
[name@server ~]$ cp -r $EBROOTAUGUSTUS/config $HOME/augustus_config
5. Check that it runs.
[name@server ~]$ export BUSCO_CONFIG_FILE=$HOME/busco_config.ini
[name@server ~]$ export AUGUSTUS_CONFIG_PATH=$HOME/augustus_config
[name@server ~]$ run_BUSCO.py --in $EBROOTBUSCO/sample_data/target.fa --out TEST --lineage_path $EBROOTBUSCO/sample_data/example --mode genome
The run_BUSCO.py command should take less than 60 seconds to complete.
Production runs which take longer should be submitted to the scheduler.
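As a rough sketch of what a scheduler submission could look like, the commands below write a Slurm job script that repeats the steps above. The account name def-someuser, the time limit, the memory request and the file name busco_job.sh are all placeholders chosen for illustration; replace them, and the input files, with values that suit your own run.

```shell
# Write a job script to busco_job.sh; the quoted 'EOF' keeps the variables
# unexpanded so that they are evaluated when the job runs.
cat > busco_job.sh <<'EOF'
#!/bin/bash
# Placeholder account name; use your own.
#SBATCH --account=def-someuser
# Assumed resource requests; adjust to the size of your data.
#SBATCH --time=03:00:00
#SBATCH --cpus-per-task=1
#SBATCH --mem=4G

module load StdEnv/2018.3 gcc/7.3.0 openmpi/3.1.4 busco/3.0.2 r/4.0.2

export BUSCO_CONFIG_FILE=$HOME/busco_config.ini
export AUGUSTUS_CONFIG_PATH=$HOME/augustus_config

run_BUSCO.py --in $EBROOTBUSCO/sample_data/target.fa --out TEST --lineage_path $EBROOTBUSCO/sample_data/example --mode genome
EOF
```

The script would then be submitted with "sbatch busco_job.sh".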
Troubleshooting
Cannot write to Augustus config path
Make sure you have copied the config directory to a writable location and exported the AUGUSTUS_CONFIG_PATH variable.
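A quick way to check both conditions before starting a run is sketched below. The function name check_augustus_config is made up for this example; it only inspects the AUGUSTUS_CONFIG_PATH variable and the directory it points to.

```shell
# Check that AUGUSTUS_CONFIG_PATH is set and points to a writable directory.
check_augustus_config() {
    if [ -z "${AUGUSTUS_CONFIG_PATH:-}" ]; then
        echo "AUGUSTUS_CONFIG_PATH is not set"
        return 1
    elif [ ! -d "$AUGUSTUS_CONFIG_PATH" ]; then
        echo "$AUGUSTUS_CONFIG_PATH is not a directory"
        return 1
    elif [ ! -w "$AUGUSTUS_CONFIG_PATH" ]; then
        echo "$AUGUSTUS_CONFIG_PATH is not writable"
        return 1
    fi
    echo "$AUGUSTUS_CONFIG_PATH is set and writable"
}
```

Calling check_augustus_config in the same shell, or at the top of a job script, reports which of the two conditions is not met.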