Database servers: Difference between revisions
No edit summary |
|||
Line 19: | Line 19: | ||
The Cedar MySQL server runs as a VM called "cedar-mysql-vm" (full name: cedar-mysql-vm.int.cedar.computecanada.ca) on a database machine. | The Cedar MySQL server runs as a VM called "cedar-mysql-vm" (full name: cedar-mysql-vm.int.cedar.computecanada.ca) on a database machine. | ||
Users who have accounts on the MySQL server are able to connect only through the Cedar headnode | Users who have accounts on the MySQL server are able to connect only through the Cedar headnode | ||
(cedar.computecanada.ca), Cedar compute nodes and | (cedar.computecanada.ca), Cedar compute nodes and cedar portal (https://gateway.cedar.computecanada.ca). | ||
For security, users cannot make an SSH connection to the database server directly. | For security, users cannot make an SSH connection to the database server directly. | ||
Line 26: | Line 26: | ||
If you need the privileges to create your own database, you will need a MySQL account. To get a MySQL account on the Cedar MySQL server, please send a request to support@computecanada.ca with the following information: | If you need the privileges to create your own database, you will need a MySQL account. To get a MySQL account on the Cedar MySQL server, please send a request to support@computecanada.ca with the following information: | ||
* Compute Canada | * Your Compute Canada username | ||
* Amount of database space needed for your project | * Amount of database space needed for your project | ||
We will then create an MySQL account with username which is the same as your computecanada username and a 16 digit random string password. The username ans password along with database server name and other required information to make a MySQL connection will be stored in a file called <code>.my.cnf</code> located in your home directory on Cedar. The file is confidential and only readable by the user, e.g., user cannot modify that because it is owned by "root" but user can delete that. So, please do not delete this file. | |||
be stored in a file called <code>.my.cnf</code> located in your home directory | |||
The file is confidential and readable | |||
Please '''do not''' use the <code>-p</code> option as an argument in running mysql. The required password will be automatically taken from your <code>.my.cnf</code> file if you do not use | Using command {{Command|mysql}} from cedar headnode you will be able to connect to the MySQL server. Please '''do not''' use the <code>-p</code> or <code>-h</code> option as an argument in running mysql. The required password and server name is this case and any other information will be automatically taken from your <code>.my.cnf</code> file if you do not use these arguments. | ||
It is acceptable to submit a long-running SQL command from the Cedar head node, as the | It is acceptable to submit a long-running SQL command from the Cedar head node, as most of the CPU usage is being taken on the database server side. However, if you are running a script which is issuing SQL commands, then it needs to be submitted as a job to the scheduler. See [[Running jobs]] for details. | ||
=== Rules to create a MySQL database === | === Rules to create a MySQL database === | ||
In order to be able to set up MySQL tables and query them, you need to create your own database | In order to be able to set up MySQL tables and query them, you need to create your own database. To create a database, the name of the database is arbitrary but it must start with | ||
<username>_ | |||
For example, if your username were "david" the name of the database '''must''' start with "david_" and the commands to create a database called "david_db1" would be: | where <username> is your MySQL username. For example, if your username were "david" the name of the database '''must''' start with "david_" and the commands to create a database called "david_db1" would be: | ||
{{Command|mysql | {{Command|mysql | ||
Line 62: | Line 48: | ||
}} | }} | ||
Here is an example of how to work with your new database and create a table in it, populate it, and query it: | Users can create multiple MySQL databases if they all start with <username>_. Here is an example of how to work with your new database and create a table in it, populate it, and query it: | ||
{{Command|mysql | {{Command|mysql | ||
Line 74: | Line 60: | ||
}} | }} | ||
The created database will automatically be accessible from the | The created database will automatically be accessible from the cedar head node, cedar compute nodes, and cedar portal (gateway.cedar.computecanada.ca), so you should not need to do any other grant. However, if you want to share your database with other users you can issue this MySQL command: | ||
{{Command|mysql | {{Command|mysql | ||
Line 93: | Line 79: | ||
=== MySQL connectivity for GBrowse === | === MySQL connectivity for GBrowse === | ||
GBrowse is a combination of database and interactive web pages for manipulating and displaying annotations on genomes. | GBrowse is a combination of database and interactive web pages for manipulating and displaying annotations on genomes. | ||
Documentation: http://gmod.org/wiki/GBrowse | Documentation: http://gmod.org/wiki/GBrowse | ||
Line 109: | Line 95: | ||
The Cedar PostgreSQL server runs as a VM called "cedar-pgsql-vm" (full name: cedar-pgsql-vm.int.cedar.computecanada.ca) on a database machine. | The Cedar PostgreSQL server runs as a VM called "cedar-pgsql-vm" (full name: cedar-pgsql-vm.int.cedar.computecanada.ca) on a database machine. | ||
Users who have accounts on the PostgreSQL server are able to connect only through the Cedar headnode | Users who have accounts on the PostgreSQL server are able to connect only through the Cedar headnode | ||
(cedar.computecanada.ca), Cedar compute nodes and | (cedar.computecanada.ca), Cedar compute nodes and Cedar portal machine. | ||
For security, users cannot make an SSH connection to the database server directly. | For security, users cannot make an SSH connection to the database server directly. | ||
To get an account and database on the Cedar PostgreSQL server, send a request to support@computecanada.ca with the following information: | To get an account and database on the Cedar PostgreSQL server, send a request to support@computecanada.ca with the following information: | ||
* | * You Compute Canada username | ||
* Amount of database space needed for your project | * Amount of database space needed for your project | ||
* The name of the database you would like to have | |||
* PostGIS extension required for the database? | * PostGIS extension required for the database? | ||
=== PostgreSQL account and connection === | === PostgreSQL account and connection === | ||
PostgreSQL uses IDENT authentication for connection from compute nodes which means | Your created PostgreSQL account will have the same username as your computecanada. You will be given one database. The name of the database is arbitrary but it must start with "<username>_" where <username> is your computecanada username. PostgreSQL uses IDENT authentication for connection from cedar compute nodes which means no password is needed to be | ||
supplied to your PostgreSQL account. | supplied to your PostgreSQL account. This also means that user A from cedar computenodes cannot make PostgreSQL connection to user B. Using passwordless connection from computenodes is very adventurous because if you have job script in which you need a PostgreSQL connection you do not need to hard coded your computecanada password in to your script which would be a big security concern. | ||
When access is from the head node, however, PostgreSQL uses PAM authentication which means that you will be prompted | When access is from the head node, however, PostgreSQL uses PAM authentication which means that you will be prompted | ||
for your Compute Canada password | for a password. Your password for PostgreSQL connection is the same as your Compute Canada password for this case. Example: | ||
{{Command|psql -h cedar-pgsql-vm -d db_[username]}} | {{Command|psql -h cedar-pgsql-vm -d db_[username]}} |
Revision as of 18:49, 22 March 2018
Database servers available for researchers
Compute Canada offers access to MySQL and Postgres database servers for researchers:
- Cedar MySQL server
- Description: General purpose server for the researcher wanting to set up SQL tables in MySQL and issue SQL commands against them.
- Server name: cedar-mysql-vm.int.cedar.computecanada.ca
- Short server name: cedar-mysql-vm (can be used instead of long name on most compute nodes)
- Version: MariaDB version 10.2 Community Edition
- Documentation: http://www.mariadb.com
- Cedar Postgres server
- Description: General purpose server for the researcher wanting to set up SQL tables in MySQL and issue SQL commands against them. Includes a PostGIS extension available for those needing to do geocoding.
- Server name: cedar-pgsql-vm.int.cedar.computecanada.ca
- Short server name: cedar-pgsql-vm (can be used instead of long name on most compute nodes)
- Version: PostgreSQL version 10.1, PostGIS version 2.4 extension available
- Documentation: https://www.postgresql.org and https://postgis.net/documentation
Cedar MySQL server
The Cedar MySQL server runs as a VM called "cedar-mysql-vm" (full name: cedar-mysql-vm.int.cedar.computecanada.ca) on a database machine. Users who have accounts on the MySQL server are able to connect only through the Cedar headnode (cedar.computecanada.ca), Cedar compute nodes and cedar portal (https://gateway.cedar.computecanada.ca).
For security, users cannot make an SSH connection to the database server directly.
MySQL account and connection
If you need the privileges to create your own database, you will need a MySQL account. To get a MySQL account on the Cedar MySQL server, please send a request to support@computecanada.ca with the following information:
- Your Compute Canada username
- Amount of database space needed for your project
We will then create an MySQL account with username which is the same as your computecanada username and a 16 digit random string password. The username ans password along with database server name and other required information to make a MySQL connection will be stored in a file called .my.cnf
located in your home directory on Cedar. The file is confidential and only readable by the user, e.g., user cannot modify that because it is owned by "root" but user can delete that. So, please do not delete this file.
Using command
[name@server ~]$ mysql
from cedar headnode you will be able to connect to the MySQL server. Please do not use the -p
or -h
option as an argument in running mysql. The required password and server name is this case and any other information will be automatically taken from your .my.cnf
file if you do not use these arguments.
It is acceptable to submit a long-running SQL command from the Cedar head node, as most of the CPU usage is being taken on the database server side. However, if you are running a script which is issuing SQL commands, then it needs to be submitted as a job to the scheduler. See Running jobs for details.
Rules to create a MySQL database
In order to be able to set up MySQL tables and query them, you need to create your own database. To create a database, the name of the database is arbitrary but it must start with
<username>_
where <username> is your MySQL username. For example, if your username were "david" the name of the database must start with "david_" and the commands to create a database called "david_db1" would be:
[name@server ~]$ mysql
mysql> CREATE DATABASE david_db1;
mysql> quit
Users can create multiple MySQL databases if they all start with <username>_. Here is an example of how to work with your new database and create a table in it, populate it, and query it:
[name@server ~]$ mysql
mysql> USE david_db1;
mysql> CREATE TABLE fubar (age integer, id varchar(10));
mysql> INSERT INTO fubar VALUES (34, '1122');
mysql> INSERT INTO fubar VALUES (22, '2233');
mysql> SELECT * FROM fubar WHERE age > 30;
mysql> SELECT age FROM fubar WHERE id = '1122';
mysql> quit
The created database will automatically be accessible from the cedar head node, cedar compute nodes, and cedar portal (gateway.cedar.computecanada.ca), so you should not need to do any other grant. However, if you want to share your database with other users you can issue this MySQL command:
[name@server ~]$ mysql
mysql> GRANT SELECT ON [database name].* TO '[username 2]'@'172.%';
mysql> quit
where [username 2] is the MySQL user who takes that grant.
MySQL connectivity for BioPerl
BioPerl is a collection of open source Perl tools for bioinformatics, genomics and life science. Documentation can be found at bioperl.org.
There are several BioPerl modules which can be used to upload data to a database. To connect to a MySQL server from one of these, the BioPerl command line should contain the -d option as follows:
-d [database name]:mysql_read_default_file=.my.cnf ....
MySQL connectivity for GBrowse
GBrowse is a combination of database and interactive web pages for manipulating and displaying annotations on genomes.
Documentation: http://gmod.org/wiki/GBrowse
In order to connect to MySQL from gbrowse, the corresponding line to connect to MySQL should contain:
db_args = -adaptor DBI::mysql -dsn [database name];mysql_read_default_file=/home/[username]/.my.cnf -user [username]
where [username] is the corresponding user name and [database name] is the name of the database.
Cedar PostgreSQL server
The Cedar PostgreSQL server runs as a VM called "cedar-pgsql-vm" (full name: cedar-pgsql-vm.int.cedar.computecanada.ca) on a database machine. Users who have accounts on the PostgreSQL server are able to connect only through the Cedar headnode (cedar.computecanada.ca), Cedar compute nodes and Cedar portal machine.
For security, users cannot make an SSH connection to the database server directly.
To get an account and database on the Cedar PostgreSQL server, send a request to support@computecanada.ca with the following information:
- You Compute Canada username
- Amount of database space needed for your project
- The name of the database you would like to have
- PostGIS extension required for the database?
PostgreSQL account and connection
Your created PostgreSQL account will have the same username as your computecanada. You will be given one database. The name of the database is arbitrary but it must start with "<username>_" where <username> is your computecanada username. PostgreSQL uses IDENT authentication for connection from cedar compute nodes which means no password is needed to be supplied to your PostgreSQL account. This also means that user A from cedar computenodes cannot make PostgreSQL connection to user B. Using passwordless connection from computenodes is very adventurous because if you have job script in which you need a PostgreSQL connection you do not need to hard coded your computecanada password in to your script which would be a big security concern.
When access is from the head node, however, PostgreSQL uses PAM authentication which means that you will be prompted for a password. Your password for PostgreSQL connection is the same as your Compute Canada password for this case. Example:
[name@server ~]$ psql -h cedar-pgsql-vm -d db_[username]
where db_[username] is the name of the database that was set up for you with your PostgreSQL account. (If you require more databases to be set up for your PostgreSQL account to use, please send a request to support@computecanada.ca).
The example above runs an older version of the psql Postgres interactive client that by default is installed on all the nodes:
[name@server ~]$ psql --version
psql (PostgreSQL) 9.6.2
You can load the most recent version of the psql Postgres client which will stay loaded until you log off from your session. Example:
[name@server ~]$ module load postgresql
[name@server ~]$ psql --version
psql (PostgreSQL) 10.2
PostgreSQL connectivity for BioPerl
To connect to Postgres from a Perl module, for example from one of BioPerl modules, the command line should contain "-a" and "-d" options as follows:
-a DBI::Pg -d dbi:Pg:dbname=[database name] ....
where [database name] is the name of your database.
PostgreSQL connectivity for gbrowse
GBrowse is a combination of database and interactive web pages for manipulating and displaying annotations on genomes.
Documentation: http://gmod.org/wiki/GBrowse
In order to connect to postres from gbrowse, the corresponding line in the configuration file should contain:
db_args = -dsn dbi:Pg:dbname=[database name ] -user [username]