Note that this will also start subjobs that were not considered before.

==Handling large files== <!--T:15-->
Let's say we want to count the characters in parallel from a big [https://en.wikipedia.org/wiki/FASTA_format FASTA] file (<tt>database.fa</tt>) in a task with 8 cores. We will have to use the GNU Parallel <tt>--pipepart</tt> and <tt>--block</tt> arguments to efficiently handle chunks of the file. Using the following command:
<!--T:16-->
{{Command|parallel --jobs $SLURM_CPUS_PER_TASK --keep-order --block -1 --recstart '>' --pipepart wc :::: database.fa}}

<!--T:17-->
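As a minimal sketch, the command can also be tried interactively on a small test file. The two-record <tt>database.fa</tt> contents and the fallback core count below are assumptions for illustration, not part of the original example; outside a Slurm job, <tt>$SLURM_CPUS_PER_TASK</tt> is unset, so a default is supplied.

```shell
#!/bin/sh
# Toy two-record FASTA file (assumed contents, for illustration only).
cat > database.fa <<'EOF'
>seq1
ACGTACGT
>seq2
TTTTAAAA
EOF

# Outside a Slurm allocation, SLURM_CPUS_PER_TASK is unset; fall back to 8.
CORES="${SLURM_CPUS_PER_TASK:-8}"

# --pipepart reads chunks directly from the file (faster than piping),
# --recstart '>' ensures every chunk begins at a FASTA record boundary,
# and --block -1 divides the file into one chunk per job slot.
if command -v parallel >/dev/null 2>&1; then
    parallel --jobs "$CORES" --keep-order --block -1 \
             --recstart '>' --pipepart wc :::: database.fa
else
    echo "GNU Parallel is not installed" >&2
fi
```

Because <tt>--keep-order</tt> is given, the per-chunk <tt>wc</tt> outputs are printed in file order even when chunks finish out of order.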
and by varying the <tt>block</tt> size we get:
<!--T:18-->
{| class="wikitable"
!