(mention license requirement) |
(Remove advertising language in intro) |
||
Line 1: | Line 1: | ||
{{Draft}} | {{Draft}} | ||
Parabricks is a software suite for performing secondary analysis of next-generation sequencing (NGS) DNA data. Parabricks is extremely fast: it can analyze 30x [https://en.wikipedia.org/wiki/Whole-genome_shotgun WGS] data for a whole human genome in about 45 minutes, compared to about 30 hours with conventional CPU-based tools. It achieves this performance through tight integration with GPUs. | |||
You can learn more at [http://www.nvidia.com/parabricks www.nvidia.com/parabricks] | You can learn more at [http://www.nvidia.com/parabricks www.nvidia.com/parabricks] | ||
Line 12: | Line 11: | ||
to use Parabricks on Compute Canada equipment. | to use Parabricks on Compute Canada equipment. | ||
==Finding and loading Parabricks == | == Finding and loading Parabricks == | ||
You can search for Parabricks like any other module with <code>module spider</code>: | You can search for Parabricks like any other module with <code>module spider</code>: | ||
Line 26: | Line 25: | ||
==Example of | == Example of use == | ||
Before you | Before you use Parabricks, make sure you have gone through the [https://www.nvidia.com/en-us/docs/parabricks/ Parabricks documentation], including their standalone tools and pipelines. Also make sure you know [https://docs.computecanada.ca/wiki/Using_GPUs_with_Slurm how to request GPUs on Compute Canada clusters]. Once you understand the above, you can submit a job like the following: | ||
<pre> | <pre> | ||
Line 54: | Line 53: | ||
{{Note | {{Note | ||
|Use absolute real paths for the files (e.g. as returned by the command <code>realpath .</code>) | |Use absolute real paths for the files (e.g. as returned by the command <code>realpath .</code>) | ||
}} | }} | ||
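As a minimal sketch of the note above, the snippet below resolves a relative input path to an absolute real path before it would be handed to Parabricks. The file name <code>sample_1.fastq.gz</code>, the temporary directory and the variable names are hypothetical placeholders, not names taken from this page.

```shell
#!/bin/bash
# Hypothetical input file, created here only so the sketch runs on its own.
workdir=$(mktemp -d)
touch "$workdir/sample_1.fastq.gz"
cd "$workdir"

# Resolve relative paths to absolute real paths (symbolic links are followed).
INPUT_DIR=$(realpath .)
INPUT_FQ=$(realpath sample_1.fastq.gz)

echo "Input directory: $INPUT_DIR"
echo "Input FASTQ:     $INPUT_FQ"
```

The resolved values can then be used wherever the job script expects a file or directory path.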
==Common issues == | == Common issues == | ||
===Almost immediate | === Almost immediate failure === | ||
If your first test fails right away, there might be a missing module or a conflicting environment variable. To solve this, try: | If your first test fails right away, there might be a missing module or a conflicting environment variable. To solve this, try: | ||
Line 70: | Line 69: | ||
}} | }} | ||
===Later | === Later failure === | ||
Parabricks often does not give you a clear traceback of the failure. This usually means that you did not request enough memory. If you are already reserving a full node through <code>--nodes=1</code>, we suggest you also use all the memory in the node with <code>--mem=0</code>. Otherwise, make sure that your pipeline has enough memory to process your data. | |||
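The memory advice above might translate into a job script header like the following sketch. The <code>--nodes=1</code> and <code>--mem=0</code> options come from the text; the <code>MEM_REPORT</code> variable and the final report line are illustrative assumptions, and the GPU request and the actual Parabricks command are left out.

```shell
#!/bin/bash
#SBATCH --nodes=1   # reserve a full node
#SBATCH --mem=0     # request all the memory available on that node

# Inside a Slurm job, SLURM_MEM_PER_NODE reflects the --mem request.
# Outside of a job the variable is not set, so fall back to a label.
MEM_REPORT=${SLURM_MEM_PER_NODE:-unset}
echo "Memory request seen by the job: $MEM_REPORT"

# Placeholder: your Parabricks command would go here.
```

Logging the request at the top of the job output makes it easier to check afterwards whether a failure without a traceback coincided with a small memory allocation.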
==Hybrid usage == | == Hybrid usage == | ||
Parabricks uses both CPUs and GPUs. During our tests, Parabricks used at least 10 CPUs, so we recommend requesting at least that many with <code>--cpus-per-task=10</code>. | Parabricks uses both CPUs and GPUs. During our tests, Parabricks used at least 10 CPUs, so we recommend requesting at least that many with <code>--cpus-per-task=10</code>. | ||
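A hybrid request could be sketched as follows. The <code>--cpus-per-task=10</code> option comes from the text above; the single-GPU <code>--gres=gpu:1</code> request, the <code>NUM_CPUS</code> variable and the warning message are assumptions made for illustration, and the Parabricks command itself is left as a placeholder.

```shell
#!/bin/bash
#SBATCH --gres=gpu:1         # assumption: one GPU; adjust to your cluster
#SBATCH --cpus-per-task=10   # at least 10 CPU cores, as recommended above

# Slurm exports SLURM_CPUS_PER_TASK inside a job when --cpus-per-task is given.
# Default to 10 when the script is run outside of a job.
NUM_CPUS=${SLURM_CPUS_PER_TASK:-10}

if [ "$NUM_CPUS" -lt 10 ]; then
    echo "Warning: only $NUM_CPUS CPUs allocated; fewer than the recommended 10." >&2
fi
echo "CPUs available to the job: $NUM_CPUS"

# Placeholder: your Parabricks command would go here.
```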