Genomics data

This site replaces the former Compute Canada documentation site, and is now being managed by the Digital Research Alliance of Canada.

This article is a draft

This article is incomplete: it is a draft, a work in progress intended for publication as an article, and it may not yet be ready for inclusion in the main wiki. It should not necessarily be considered factual or authoritative.

In partnership with C3G, we maintain several genome databases that are available on Compute Canada's general-purpose clusters (Béluga, Cedar, Graham). In addition to the FASTA sequence, many genomes include aligner indices and annotation files.

When available, the genomics data are located under /cvmfs/ref.mugqic/genomes.

We encourage you to browse the directory to get more information.

[user@cedar5 ~]$  ls -1 /cvmfs/ref.mugqic/genomes
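
To go one step beyond the listing above, the following is a minimal sketch of a helper for browsing the species/ subdirectory mentioned below. The function name list_genomes and the GENOMES_ROOT variable are hypothetical, introduced only for illustration, and the naming of the entries inside species/ is not described on this page, so the filter is a plain case-insensitive substring match on whatever names are found.

```shell
#!/usr/bin/env bash
# Root of the genome repository, as given in the text above.
# GENOMES_ROOT is a hypothetical variable used only in this sketch.
GENOMES_ROOT="${GENOMES_ROOT:-/cvmfs/ref.mugqic/genomes}"

# list_genomes [PATTERN]
# Print the entries under species/, one per line. If PATTERN is given,
# keep only names containing it (case-insensitive, literal match).
list_genomes() {
    local pattern="${1:-}"
    local dir="$GENOMES_ROOT/species"
    if [ ! -d "$dir" ]; then
        echo "list_genomes: $dir not found (is CVMFS mounted?)" >&2
        return 1
    fi
    # An empty pattern matches every line, so all entries are printed.
    ls -1 "$dir" | grep -i -F -- "$pattern"
}
```

Once the function is defined in your shell, `list_genomes` prints every entry and `list_genomes sapiens` keeps only the entries whose name contains that string. The function returns a non-zero status if the directory is missing or nothing matches.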

Available genomes in species/

Common name   Species                                  Builds
Human         Homo sapiens                             GRCh38, GRCh37, hg19
Mouse         Mus musculus                             GRCm38, mm10, mm9, NCBIM37
Rat           Rattus norvegicus                        rn5, Rnor_5.0, Rnor_6.0
Monkey        Macaca mulatta                           MMUL_1
Chimpanzee    Pan troglodytes                          panTro4, CHIMP2.1.4
Baboon        Papio anubis                             PapAnu2.0
Dog           Canis familiaris                         CanFam3.1
Cow           Bos taurus                               UMD3.1
Chicken       Gallus gallus                            Galgal4
Fly           Drosophila melanogaster                  BDGP5
C. elegans    Caenorhabditis elegans                   WBcel235
Yeast         Saccharomyces cerevisiae                 R64-1-1
              Schizosaccharomyces pombe                ASM294v2
Bacteria      Escherichia coli str k_12 substr dh10b   ASM1942v1
              Pseudomonas aeruginosa pa14              Pseu aeru PA14_V1
              Pseudomonas aeruginosa UCBPP_PA14        ASM1462v1
Plants        Arabidopsis thaliana                     TAIR10