GNU Parallel: Difference between revisions

Note that this will also start subjobs that were not considered before.


==Handling large files== <!--T:15-->
Let's say we want to count the characters in a large [https://en.wikipedia.org/wiki/FASTA_format FASTA] file (<tt>database.fa</tt>) in parallel, in a task with 8 cores. The GNU Parallel <tt>--pipepart</tt> and <tt>--block</tt> arguments let us process the file efficiently in chunks, using the following command:


<!--T:16-->
{{Command|parallel --jobs $SLURM_CPUS_PER_TASK --keep-order --block -1 --recstart '>' --pipepart wc :::: database.fa}}
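As a rough guide to what a negative <tt>--block</tt> value means: with <tt>--pipepart</tt>, <tt>--block -n</tt> asks for approximately <tt>n</tt> blocks per job slot, so the chunk size is roughly the file size divided by (jobs × n). The sketch below only does that arithmetic; the function name <tt>approx_block_bytes</tt> and the 1 GiB file size are hypothetical, and the real chunk boundaries are shifted to the next record start (<tt>'>'</tt>), so actual sizes will differ slightly.

```shell
#!/bin/sh
# Minimal sketch (not part of GNU Parallel): estimate the chunk size implied
# by a negative --block value when used with --pipepart.
# Assumption: --block -n yields about n blocks per job slot.
approx_block_bytes() {
    file_bytes=$1      # size of the input file in bytes
    jobs=$2            # number of job slots (e.g. $SLURM_CPUS_PER_TASK)
    blocks_per_job=$3  # the n in --block -n
    echo $(( file_bytes / (jobs * blocks_per_job) ))
}

# Hypothetical 1 GiB file processed with 8 job slots
approx_block_bytes 1073741824 8 1   # corresponds to --block -1
approx_block_bytes 1073741824 8 2   # corresponds to --block -2
```

With these hypothetical numbers, <tt>--block -1</tt> corresponds to chunks of about 128 MiB, and <tt>--block -2</tt> to chunks of about 64 MiB.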


<!--T:17-->
Varying the <tt>--block</tt> size, we get:


<!--T:18-->
{| class="wikitable"
!  