GBrowse: Difference between revisions
No edit summary |
No edit summary |
||
Line 26: | Line 26: | ||
<!--T:10--> | <!--T:10--> | ||
/project/GROUPID/gbrowse/USERNAME/conf | /project/''GROUPID''/gbrowse/''USERNAME''/conf | ||
<!--T:11--> | <!--T:11--> | ||
where <code>GROUPID</code> is your group id and <code>USERNAME</code> is your user name. We will create a symbolic link from <code>${HOME}/gbrowse-config/</code> to this directory for your convenience. Files in this directory should be readable by all members of the group, so please do not change the group permission of files in this directory. | where <code>''GROUPID''</code> is your group id and <code>''USERNAME''</code> is your user name. We will create a symbolic link from <code>${HOME}/gbrowse-config/</code> to this directory for your convenience. Files in this directory should be readable by all members of the group, so please do not change the group permission of files in this directory. | ||
===Configuring the database connection=== <!--T:12--> | ===Configuring the database connection=== <!--T:12--> | ||
Line 40: | Line 40: | ||
db_adaptor = Bio::DB::SeqFeature::Store | db_adaptor = Bio::DB::SeqFeature::Store | ||
db_args = -adaptor DBI::mysql | db_args = -adaptor DBI::mysql | ||
-dsn DATABASE;mysql_read_default_file=/home/SHARED/.my.cnf | -dsn ''DATABASE'';mysql_read_default_file=/home/''SHARED''/.my.cnf | ||
<!--T:15--> | <!--T:15--> | ||
where <code>DATABASE</code> is the name of your database and <code>SHARED</code> is the shared-account. The <code>.my.cnf</code> file is a text file that is created by CC staff. It contains information required for the shared account to make a connection to MySQL. | where <code>''DATABASE''</code> is the name of your database and <code>''SHARED''</code> is the shared-account. The <code>.my.cnf</code> file is a text file that is created by CC staff. It contains information required for the shared account to make a connection to MySQL. | ||
<!--T:16--> | <!--T:16--> | ||
Line 52: | Line 52: | ||
db_adaptor = Bio::DB::SeqFeature::Store | db_adaptor = Bio::DB::SeqFeature::Store | ||
db_args = -adaptor DBI::Pg | db_args = -adaptor DBI::Pg | ||
-dsn = dbi:Pg:dbname=DATABASE | -dsn = dbi:Pg:dbname=''DATABASE'' | ||
where <code>DATABASE</code> is the name of your database. | where <code>''DATABASE''</code> is the name of your database. | ||
==Using GBrowse== <!--T:18--> | ==Using GBrowse== <!--T:18--> | ||
Line 61: | Line 61: | ||
<!--T:20--> | <!--T:20--> | ||
GBrowse is able to read <code>.bam</code> files directly. You do not need to upload them to the database in order to display them. If you want GBrowse to read these .bam files: | GBrowse is able to read <code>.bam</code> files directly. You do not need to upload them to the database in order to display them. If you want GBrowse to read these .bam files directly: | ||
* Files need to be copied to your <code>/project</code> directory and they should be readable by the group. | * Files need to be copied to your <code>/project</code> directory and they should be readable by the group. | ||
* The directory that contains the <code>.bam</code> files must have the setgid and group-execute bits turned on; that is, the output of <code>ls –l</code> must show a small "s" in the group-execute field (not large "S"). | * The directory that contains the <code>.bam</code> files must have the setgid and group-execute bits turned on; that is, the output of <code>ls –l</code> must show a small "s" in the group-execute field (not large "S"). | ||
Line 70: | Line 70: | ||
[example_bam:database] | [example_bam:database] | ||
db_adaptor = Bio::DB::Sam | db_adaptor = Bio::DB::Sam | ||
db_args = -bam /project/GROUPID/USERNAME/gbrowse_bam_files/example_file.bam | db_args = -bam /project/''GROUPID''/''USERNAME''/gbrowse_bam_files/example_file.bam | ||
search options = default | search options = default | ||
Line 80: | Line 80: | ||
<!--T:24--> | <!--T:24--> | ||
module load bioperl/1.7.1 | module load bioperl/1.7.1 | ||
bp_seqfeature_load.pl -c –d DATABASE:mysql_read_default_file=/home/USERNAME/.my.cnf \ | bp_seqfeature_load.pl -c –d ''DATABASE'':mysql_read_default_file=/home/''USERNAME''/.my.cnf \ | ||
example_genomic_sequence.fa header_file | example_genomic_sequence.fa header_file | ||
<!--T:25--> | <!--T:25--> | ||
In this example DATABASE is the name of your database and <code>example_genomic_sequence.fa</code> is the [https://en.wikipedia.org/wiki/FASTA_format fasta file] containing the entire genome that you want to visualize with GBrowse. <code>header_file</code> contains details about the length of the chromosomes. Here is an example of a header file: | In this example <code>''DATABASE''</code> is the name of your database and <code>example_genomic_sequence.fa</code> is the [https://en.wikipedia.org/wiki/FASTA_format fasta file] containing the entire genome that you want to visualize with GBrowse. <code>header_file</code> contains details about the length of the chromosomes. Here is an example of a header file: | ||
<!--T:26--> | <!--T:26--> | ||
Line 99: | Line 99: | ||
<!--T:27--> | <!--T:27--> | ||
We remind you that the above commands should be run via the [[Running jobs|job scheduler]]. Do not run these on the head node! | We remind you that the above commands should be run via the [[Running jobs|job scheduler]]. Do not run these on the head node! | ||
Once you uploaded your data to your database you need to grant view access to the SHARED account so that GBrowse is able to access your database for reading. This can be done by running the following commands, | |||
for MySQL: | |||
<!--T:28--> | |||
and Postgres respectively: | |||
</translate> | </translate> |
Revision as of 23:14, 26 July 2019
Introduction
GBrowse is a combination of database and interactive web pages for manipulating and displaying annotations on genomes. It requires a web interface to display. GBrowse has been installed on Cedar. The web address of the installation is https://gateway.cedar.computecanada.ca.
The Cedar installation differs in some ways from the standard GBrowse setup described at the official website: http://gmod.org/wiki/GBrowse, particularly with regard to authentication and authorization.
Requesting access to GBrowse
In order for GBrowse to be able to access your files and directories, Compute Canada (CC) staff will create a shared account for each research group that requests access to GBrowse. While using GBrowse, any member of a research group can read GBrowse config files and input files belonging to any other member of that group. If you wish to use GBrowse, the Principal Investigator (PI) of your group must agree to this change from the usual file security practices. Have the PI write to support@computecanada.ca indicating that they want a GBrowse account to be created for the group, and that they understand the implications of the shared account.
You must also have a database account on Cedar. If you already have one, please give the name of the database in your email. If you do not already have a database account, please read Database servers carefully and answer the questions given there for setting up a database.
Setting up GBrowse
Config files
Since Gbrowse needs to be able to read config files of all users within a group, place your GBrowse config files in the following directory:
/project/GROUPID/gbrowse/USERNAME/conf
where GROUPID
is your group id and USERNAME
is your user name. We will create a symbolic link from ${HOME}/gbrowse-config/
to this directory for your convenience. Files in this directory should be readable by all members of the group, so please do not change the group permission of files in this directory.
Configuring the database connection
If you use MySQL, you need the following in your GBrowse config files:
[username_example_genome:database] db_adaptor = Bio::DB::SeqFeature::Store db_args = -adaptor DBI::mysql -dsn DATABASE;mysql_read_default_file=/home/SHARED/.my.cnf
where DATABASE
is the name of your database and SHARED
is the shared-account. The .my.cnf
file is a text file that is created by CC staff. It contains information required for the shared account to make a connection to MySQL.
If you decide to use Postgres you need the following in your GBrowse config files:
[username_example_genome:database] db_adaptor = Bio::DB::SeqFeature::Store db_args = -adaptor DBI::Pg -dsn = dbi:Pg:dbname=DATABASE
where DATABASE
is the name of your database.
Using GBrowse
Input files
GBrowse is able to read .bam
files directly. You do not need to upload them to the database in order to display them. If you want GBrowse to read these .bam files directly:
- Files need to be copied to your
/project
directory and they should be readable by the group. - The directory that contains the
.bam
files must have the setgid and group-execute bits turned on; that is, the output ofls –l
must show a small "s" in the group-execute field (not large "S"). - Make sure that the
.bam
file's group-ownership is set to your group and not your username. For example,jsmith:jsmith
is wrong,jsmith:def-kjones
is right. - Edit your config file to specify the path to the
.bam
file. Here is an example:
[example_bam:database] db_adaptor = Bio::DB::Sam db_args = -bam /project/GROUPID/USERNAME/gbrowse_bam_files/example_file.bam search options = default
Uploading files to the database
This can be done using BioPerl. Here are commands that need to be run.
module load bioperl/1.7.1 bp_seqfeature_load.pl -c –d DATABASE:mysql_read_default_file=/home/USERNAME/.my.cnf \ example_genomic_sequence.fa header_file
In this example DATABASE
is the name of your database and example_genomic_sequence.fa
is the fasta file containing the entire genome that you want to visualize with GBrowse. header_file
contains details about the length of the chromosomes. Here is an example of a header file:
##sequence-region I 1 15072434 ##sequence-region II 1 15279421 ##sequence-region III 1 13783801 ##sequence-region IV 1 17493829 ##sequence-region V 1 20924180 ##sequence-region X 1 17718942 ##sequence-region MtDNA 1 13794
We remind you that the above commands should be run via the job scheduler. Do not run these on the head node!
Once you uploaded your data to your database you need to grant view access to the SHARED account so that GBrowse is able to access your database for reading. This can be done by running the following commands,
for MySQL:
and Postgres respectively: