GATK: Difference between revisions

12 bytes added ,  1 year ago
no edit summary
Line 83: Line 83:


<!--T:51-->
<!--T:51-->
Also, when using <code>GenomicsDBImport</code> make sure to have the option <code>--genomicsdb-shared-posixfs-optimizations</code> enabled as it "Allow[s] for optimizations to improve the usability and performance for shared Posix Filesystems(e.g. NFS, Lustre)". If not possible or if you are using GNU parallel to run multiple intervals at the same time, please copy your database to <code>${SLURM_TMPDIR}</code> and run it from there as your IO operations might disrupt the function of the Filesystem. <code>${SLURM_TMPDIR}</code> is a local storage and therefore is not only faster, but the IO operations would not affect other users.
Also, when using <code>GenomicsDBImport</code>, make sure to enable the option <code>--genomicsdb-shared-posixfs-optimizations</code>, as it "Allow[s] for optimizations to improve the usability and performance for shared Posix Filesystems(e.g. NFS, Lustre)". If this is not possible, or if you are using GNU parallel to run multiple intervals at the same time, please copy your database to <code>${SLURM_TMPDIR}</code> and run it from there, as your IO operations might otherwise disrupt the function of the filesystem. <code>${SLURM_TMPDIR}</code> is local storage, so it is not only faster, but IO operations on it also do not affect other users.
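
As a rough illustration of the advice above, the following sketch assembles a <code>GenomicsDBImport</code> command whose workspace lives on node-local storage and which also enables the shared-filesystem option. The workspace name <code>my_database</code>, the sample map <code>samples.map</code> and the interval <code>chr1</code> are placeholders, not values from this page, and the fallback to <code>/tmp</code> is only there so the snippet can be tried outside a job. The command is printed rather than executed.

```shell
# Placeholders (assumptions): my_database, samples.map and chr1 are illustrative names.
# Outside a Slurm job SLURM_TMPDIR is unset, so fall back to /tmp for a dry run.
LOCAL_DIR="${SLURM_TMPDIR:-/tmp}"
WORKSPACE="${LOCAL_DIR}/my_database"

# Assemble the command; the last flag is the option recommended in the text above.
GATK_CMD="gatk GenomicsDBImport --genomicsdb-workspace-path ${WORKSPACE} --sample-name-map samples.map -L chr1 --genomicsdb-shared-posixfs-optimizations true"

# Print the command for inspection instead of running it.
echo "${GATK_CMD}"
```

Node-local job storage is normally cleared when the job ends, so a workspace created there would need to be copied back to project storage before the job finishes.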


===Earlier versions than GATK 4 === <!--T:14-->
===Earlier versions than GATK 4 === <!--T:14-->
Line 125: Line 125:


<!--T:22-->
<!--T:22-->
For the commands that do use Spark, you can request multiple cpus. <b>NOTE:</b> Please provide the exact number of cpus to the spark command.  For example if you requested 10 cpus, use <code>--spark-master local[10]</code> instead of <code>--spark-master local[*]</code>. If you want to use multiple nodes to scale the Spark cluster, you have to first [[Apache_Spark|deploy a SPARK cluster]] and then set the appropriate variables in the GATK command.
For the commands that do use Spark, you can request multiple CPUs. <b>NOTE:</b> Please provide the exact number of CPUs to the <code>spark</code> command. For example, if you requested 10 CPUs, use <code>--spark-master local[10]</code> instead of <code>--spark-master local[*]</code>. If you want to use multiple nodes to scale the Spark cluster, you have to first [[Apache_Spark|deploy a Spark cluster]] and then set the appropriate variables in the GATK command.


==Running GATK via Apptainer== <!--T:36-->
==Running GATK via Apptainer== <!--T:36-->


<!--T:37-->
<!--T:37-->
If you encounter errors like "[https://gatk.broadinstitute.org/hc/en-us/community/posts/360067054832-GATK-4-1-7-0-error-java-lang-IllegalArgumentException-malformed-input-off-17635906-length-1 IllegalArgumentException]" while using the installed modules on our clusters, we recommend you to try another workflow by using the program via [[Apptainer]].
If you encounter errors like [https://gatk.broadinstitute.org/hc/en-us/community/posts/360067054832-GATK-4-1-7-0-error-java-lang-IllegalArgumentException-malformed-input-off-17635906-length-1 IllegalArgumentException] while using the installed modules on our clusters, we recommend that you try another workflow by using the program via [[Apptainer]].


<!--T:38-->
<!--T:38-->
Line 213: Line 213:


===FAQ on GATK === <!--T:31-->
===FAQ on GATK === <!--T:31-->
You can find GATK's FAQ's in their [https://gatk.broadinstitute.org/hc/en-us/sections/360007226791-Troubleshooting-GATK4-Issues website].
You can find GATK's [https://gatk.broadinstitute.org/hc/en-us/sections/360007226791-Troubleshooting-GATK4-Issues FAQs on their website].


=References = <!--T:32-->
=References = <!--T:32-->
rsnt_translations