Line 78: | Line 78: | ||
Note that all options passed to <code>--java-options</code> have to be within quotation marks. | Note that all options passed to <code>--java-options</code> have to be within quotation marks. | ||
=== Considerations | === Considerations regarding our systems === <!--T:50--> | ||
To use GATK on our systems, we recommend that you use the <code>--tmp-dir</code> option and set it to <code>${SLURM_TMPDIR}</code> when in an sbatch job, so that temporary files are redirected to local storage. | To use GATK on our systems, we recommend that you use the <code>--tmp-dir</code> option and set it to <code>${SLURM_TMPDIR}</code> when in an sbatch job, so that temporary files are redirected to local storage. | ||
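The recommendation above can be sketched as a job script fragment. This is only an illustration: the tool (<code>HaplotypeCaller</code>), the file names, the resource requests, the heap size and the fallback to <code>/tmp</code> outside a job are all assumptions, not values taken from this page. The command is printed rather than executed (a dry run); run the printed command once it looks right.

```bash
#!/bin/bash
#SBATCH --cpus-per-task=1
#SBATCH --mem=8G            # assumed value, adjust to your data
#SBATCH --time=03:00:00     # assumed value

# Inside a job, SLURM_TMPDIR points to node-local storage.
# Falling back to /tmp outside a job is an assumption of this sketch.
TMP_DIR="${SLURM_TMPDIR:-/tmp}"

# Hypothetical input and output names; note the quoted --java-options value.
CMD="gatk --java-options \"-Xmx6G\" HaplotypeCaller -R reference.fasta -I input.bam -O output.vcf.gz --tmp-dir ${TMP_DIR}"

# Dry run: show the command that would be executed.
echo "${CMD}"
```

Keeping the temporary directory in a variable makes it easy to reuse the same value for every GATK call in the script.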
Line 86: | Line 86: | ||
===Earlier versions than GATK 4 === <!--T:14--> | ===Earlier versions than GATK 4 === <!--T:14--> | ||
Earlier versions of GATK do not have the | Earlier versions of GATK do not have the <code>gatk</code> command. Instead, one has to call the jar file: | ||
<!--T:15--> | <!--T:15--> | ||
Line 106: | Line 106: | ||
===Multicore usage === <!--T:19--> | ===Multicore usage === <!--T:19--> | ||
Most GATK (>=4) tools are not multicore by default. This means that you should request only one core when calling these kinds of tools. Some tools use threads in some of the computations (e.g. <code>Mutect2</code> has the <code>--native-pair-hmm-threads</code>) and therefore you can request more CPUs (most of them with up to 4 threads) for these computations. GATK4, however, does provide | Most GATK (>=4) tools are not multicore by default. This means that you should request only one core when calling these kinds of tools. Some tools use threads in some of the computations (e.g. <code>Mutect2</code> has the <code>--native-pair-hmm-threads</code>) and therefore you can request more CPUs (most of them with up to 4 threads) for these computations. GATK4, however, does provide <b>some</b> [https://gatk.broadinstitute.org/hc/en-us/articles/360035890591-Spark SPARK commands]: | ||
<!--T:46--> | <!--T:46--> | ||
Line 117: | Line 117: | ||
<!--T:48--> | <!--T:48--> | ||
- Some GATK tools exist in distinct Spark-capable and non-Spark-capable versions. | - Some GATK tools exist in distinct Spark-capable and non-Spark-capable versions. | ||
The "sparkified" versions have the suffix | The "sparkified" versions have the suffix <i>Spark</i> at the end of their names. Many of these are still experimental; down the road we plan to consolidate them so that there will be only one version per tool. | ||
<!--T:49--> | <!--T:49--> | ||
Line 125: | Line 125: | ||
<!--T:22--> | <!--T:22--> | ||
For the commands that do use Spark, you can request multiple CPUs. | For the commands that do use Spark, you can request multiple CPUs. <b>NOTE:</b> Please provide the exact number of CPUs to the Spark command. For example, if you requested 10 CPUs, use <code>--spark-master local[10]</code> instead of <code>--spark-master local[*]</code>. If you want to use multiple nodes to scale the Spark cluster, you have to first [[Apache_Spark|deploy a SPARK cluster]] and then set the appropriate variables in the GATK command. | ||
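One way to keep the Spark thread count in step with the allocation is to derive it from <code>SLURM_CPUS_PER_TASK</code> instead of typing the number twice. The fragment below is a sketch under assumptions: the tool (<code>MarkDuplicatesSpark</code>), the file names, the memory request and the fallback to a single core outside a job are illustrative only. The command is printed rather than executed.

```bash
#!/bin/bash
#SBATCH --cpus-per-task=10
#SBATCH --mem=32G           # assumed value, adjust to your data

# Number of CPUs granted to the task; defaulting to 1 outside a job
# is an assumption of this sketch.
NCPUS="${SLURM_CPUS_PER_TASK:-1}"

# Pass the exact CPU count to Spark rather than local[*].
SPARK_MASTER="local[${NCPUS}]"

# Hypothetical input and output names.
CMD="gatk MarkDuplicatesSpark -I input.bam -O marked_duplicates.bam --spark-master ${SPARK_MASTER}"

# Dry run: show the command that would be executed.
echo "${CMD}"
```

When typing the command directly, quoting the value (for example <code>--spark-master 'local[10]'</code>) avoids the shell treating the square brackets as a glob pattern.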
==Running GATK via Apptainer== <!--T:36--> | ==Running GATK via Apptainer== <!--T:36--> | ||
Line 165: | Line 165: | ||
==Frequently asked questions == <!--T:23--> | ==Frequently asked questions == <!--T:23--> | ||
===How do I add a read group (RG) tag in my bam file? === | ===How do I add a read group (RG) tag in my bam file? === | ||
Assuming that you want to add a read group called | Assuming that you want to add a read group called <i>tag</i> to the file called <i>input.bam</i>, you can use the GATK/PICARD command [https://gatk.broadinstitute.org/hc/en-us/articles/360037226472-AddOrReplaceReadGroups-Picard- AddOrReplaceReadGroups]: | ||
<pre> | <pre> | ||
gatk AddOrReplaceReadGroups \ | gatk AddOrReplaceReadGroups \ |