BUSCO/en: Difference between revisions

Revision as of 20:23, 17 March 2021


BUSCO stands for "Benchmarking Universal Single-Copy Orthologs". It is an application for assessing the completeness of genome assemblies and annotations. For more information, see the user manual.

Available versions

Version 3.0.2 of BUSCO is installed as a module on CVMFS and is accessible on all clusters. See "Using BUSCO from CVMFS" below for how to use it.

Newer versions can be installed in your own account using a virtual environment, as follows:

[name@server ~]$ module load python/3.7.4
[name@server ~]$ git clone https://gitlab.com/ezlab/busco.git
[name@server ~]$ virtualenv /home/$USER/busco_env
[name@server ~]$ source /home/$USER/busco_env/bin/activate
(busco_env) [name@server ~]$ pip install Biopython
(busco_env) [name@server ~]$ cd ~/busco
(busco_env) [name@server busco]$ python setup.py install
(busco_env) [name@server busco]$ cp -r scripts test_data /home/$USER/busco_env/


Then add "/home/$USER/busco_env/scripts" to your PATH.
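
One way to extend the search path is sketched below. It assumes the virtual environment lives in /home/$USER/busco_env, as in the commands above, and prepends the scripts directory only if it is not already listed. The variable name BUSCO_SCRIPTS is purely illustrative; BUSCO itself does not read it.

```shell
# Assumption: the virtual environment was created in /home/$USER/busco_env.
# BUSCO_SCRIPTS is an illustrative variable name, not one that BUSCO reads.
BUSCO_SCRIPTS="/home/${USER:-$(id -un)}/busco_env/scripts"

# Prepend the directory to PATH unless it is already listed.
case ":${PATH}:" in
  *":${BUSCO_SCRIPTS}:"*) ;;
  *) export PATH="${BUSCO_SCRIPTS}:${PATH}" ;;
esac
```

Adding these lines to ~/.bashrc should make the change persist across logins.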

Using BUSCO from CVMFS

1. Load the necessary modules:

[name@server ~]$ module load StdEnv/2018.3 gcc/7.3.0 openmpi/3.1.4 busco/3.0.2 r/4.0.2

This will also load modules for augustus, blast+, hmmer and some other software packages that BUSCO relies upon.

2. Copy the configuration file:

[name@server ~]$ cp -v $EBROOTBUSCO/config/config.ini.default $HOME/busco_config.ini

or

[name@server ~]$ wget -O $HOME/busco_config.ini https://gitlab.com/ezlab/busco/raw/master/config/config.ini.default

3. Edit the configuration file. The locations of external tools are all specified in the last section, which is shown below:

File: partial_busco_config.ini

[tblastn]
# path to tblastn
path = /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx512/Compiler/gcc7.3/blast+/2.7.1/bin/

[makeblastdb]
# path to makeblastdb
path = /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx512/Compiler/gcc7.3/blast+/2.7.1/bin/

[augustus]
# path to augustus
path = /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx512/Compiler/gcc7.3/augustus/3.3/bin/

[etraining]
# path to augustus etraining
path = /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx512/Compiler/gcc7.3/augustus/3.3/bin/

# path to augustus perl scripts, redeclare it for each new script
[gff2gbSmallDNA.pl]
path = /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx512/Compiler/gcc7.3/augustus/3.3/scripts/
[new_species.pl]
path = /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx512/Compiler/gcc7.3/augustus/3.3/scripts/
[optimize_augustus.pl]
path = /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx512/Compiler/gcc7.3/augustus/3.3/scripts/

[hmmsearch]
# path to HMMsearch executable
path = /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx512/Compiler/gcc7.3/hmmer/3.1b2/bin/

[Rscript]
# path to Rscript, if you wish to use the plot tool
path = /cvmfs/soft.computecanada.ca/easybuild/software/2017/avx512/Compiler/gcc7.3/r/4.0.2/bin/
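
The exact directories depend on the module versions and CPU architecture in use, so it may be worth confirming them before editing the file. The following is a minimal sketch, assuming the modules from step 1 are loaded and place each executable on PATH. The helper tool_dir is hypothetical and not part of BUSCO, and the Augustus Perl scripts are not covered because they may not be on PATH.

```shell
# tool_dir is a hypothetical helper: print the directory containing a command,
# with a trailing slash, or return non-zero if the command is not on PATH.
tool_dir() {
  tool_path=$(command -v "$1") || return 1
  printf '%s/\n' "$(dirname "$tool_path")"
}

# Assumption: the modules from step 1 put these executables on PATH.
for tool in tblastn makeblastdb augustus etraining hmmsearch Rscript; do
  if dir=$(tool_dir "$tool"); then
    printf '[%s] path = %s\n' "$tool" "$dir"
  else
    printf '[%s] not found on PATH\n' "$tool"
  fi
done
```

Each line printed for a tool that was found can be compared with the matching section of busco_config.ini.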


4. Copy the Augustus config directory to a writable location:

[name@server ~]$ cp -r $EBROOTAUGUSTUS/config $HOME/augustus_config

5. Check that it runs.

[name@server ~]$ export BUSCO_CONFIG_FILE=$HOME/busco_config.ini
[name@server ~]$ export AUGUSTUS_CONFIG_PATH=$HOME/augustus_config
[name@server ~]$ run_BUSCO.py --in $EBROOTBUSCO/sample_data/target.fa --out TEST --lineage_path $EBROOTBUSCO/sample_data/example --mode genome


The run_BUSCO.py test command should take less than 60 seconds to complete. Production runs that take longer should be submitted to the scheduler.
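
For a production run, a job script might look like the sketch below. It assumes the scheduler is Slurm. The account name, time limit, memory request, input assembly and lineage directory are all placeholders to adapt, and only the run_BUSCO.py options already shown above are used.

```shell
# Write a hypothetical Slurm job script. Every #SBATCH value and both
# input names are placeholders, not recommendations.
cat > busco_job.sh <<'EOF'
#!/bin/bash
# Placeholder account, time limit and memory request: adjust to your allocation.
#SBATCH --account=def-someuser
#SBATCH --time=03:00:00
#SBATCH --mem=8G
#SBATCH --cpus-per-task=1

module load StdEnv/2018.3 gcc/7.3.0 openmpi/3.1.4 busco/3.0.2 r/4.0.2

export BUSCO_CONFIG_FILE=$HOME/busco_config.ini
export AUGUSTUS_CONFIG_PATH=$HOME/augustus_config

# my_assembly.fa and my_lineage are placeholder names for your own data.
run_BUSCO.py --in my_assembly.fa --out my_run --lineage_path my_lineage --mode genome
EOF
```

The resulting file would then be submitted with sbatch busco_job.sh.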

Troubleshooting

Cannot write to Augustus config path

Make sure you have copied the config directory to a writable location and exported the AUGUSTUS_CONFIG_PATH variable.
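
A quick way to check both conditions is sketched below. The function check_augustus_config is a hypothetical helper, not part of BUSCO or Augustus; it only tests that the variable is exported and that it points to a writable directory.

```shell
# check_augustus_config is a hypothetical helper, not part of BUSCO or Augustus.
# It prints the reason AUGUSTUS_CONFIG_PATH is unusable and returns non-zero.
check_augustus_config() {
  # A child shell only sees exported variables, so this detects a missing export.
  if ! sh -c '[ -n "${AUGUSTUS_CONFIG_PATH:-}" ]'; then
    echo "AUGUSTUS_CONFIG_PATH is not set or not exported" >&2
    return 1
  fi
  if [ ! -d "$AUGUSTUS_CONFIG_PATH" ]; then
    echo "$AUGUSTUS_CONFIG_PATH is not a directory" >&2
    return 1
  fi
  if [ ! -w "$AUGUSTUS_CONFIG_PATH" ]; then
    echo "$AUGUSTUS_CONFIG_PATH is not writable" >&2
    return 1
  fi
  echo "OK: $AUGUSTUS_CONFIG_PATH is a writable directory"
}

check_augustus_config || echo "Fix the problem above before running BUSCO." >&2
```

If the check fails, repeating step 4 and the export command from step 5 should resolve it.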