GNU Parallel: Difference between revisions

No edit summary
Line 91: Line 91:


==Running on Multiple Nodes== <!--T:10-->
{{Warning
|title=Not recommended
|content=
Although GNU Parallel can be used across multiple nodes, doing so can cause problems and is not recommended, particularly for a large number of short jobs. GNU Parallel has to start an SSH session on each remote node, an operation that takes several seconds and can hang. If you choose to use it anyway, make sure you add a delay of 30 seconds or more between jobs with the option <tt>--sshdelay 30</tt>.
}}
You can also use GNU Parallel to distribute a workload across multiple nodes in a cluster, for example within a job on a Compute Canada server. The following example shows this use:
{{Command
Line 96: Line 102:
}}
{{Command
|parallel --jobs $SLURM_NTASKS_PER_NODE --sshloginfile ./node_list_${SLURM_JOB_ID} --env MY_VARIABLE --workdir $PWD ./my_program
}}
{{Command
|parallel --jobs $SLURM_NTASKS_PER_NODE --sshloginfile ./node_list_${SLURM_JOB_ID} --env MY_VARIABLE --workdir $PWD --sshdelay 30 ./my_program
}}
In this case, we create a file containing the list of nodes and use it to tell GNU Parallel which nodes to distribute tasks to. The <tt>--env</tt> option transfers a named environment variable to all the nodes, while the <tt>--workdir</tt> option ensures that the GNU Parallel tasks start in the same directory as on the main node.
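As a rough illustration of the options described above, the following dry-run sketch assembles the command line and prints it instead of running it. The value given to <tt>MY_VARIABLE</tt> and the fallback values used outside a Slurm job are assumptions made for this example only. The variable is exported first because <tt>--env</tt> copies exported environment variables.

```shell
#!/bin/sh
# Hypothetical value for illustration; --env copies exported variables,
# so the variable is exported before GNU Parallel would be called.
export MY_VARIABLE="example-value"

# Placeholder fallbacks so the sketch also runs outside a Slurm job.
jobs_per_node="${SLURM_NTASKS_PER_NODE:-4}"
node_file="./node_list_${SLURM_JOB_ID:-0}"

# Assemble the command line as a string and print it (dry run only).
parallel_cmd="parallel --jobs $jobs_per_node --sshloginfile $node_file --env MY_VARIABLE --workdir $PWD --sshdelay 30 ./my_program"
echo "$parallel_cmd"
```

Printing the command before submitting the job makes it easy to check that the Slurm variables expand as expected.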
Line 103: Line 109:
For example, when a long list of [[OpenMP]] tasks is executed as a single job submitted with <tt>--nodes=N</tt>, <tt>--ntasks-per-node=5</tt> and <tt>--cpus-per-task=8</tt>, the following command takes into account all the processes to be started on all reserved nodes as well as the number of OpenMP threads per process:
{{Command
|parallel --jobs $SLURM_NTASKS_PER_NODE --sshloginfile ./node_list_${SLURM_JOB_ID} --workdir $PWD --env OMP_NUM_THREADS{{=}}$SLURM_CPUS_PER_TASK ./my_program
}}
{{Command
|parallel --jobs $SLURM_NTASKS_PER_NODE --sshloginfile ./node_list_${SLURM_JOB_ID} --workdir $PWD --env OMP_NUM_THREADS{{=}}$SLURM_CPUS_PER_TASK --sshdelay 30 ./my_program
}}
In this case, up to <tt>5*N</tt> OpenMP processes run simultaneously, each with a CPU usage of up to 800% (8 threads per process).
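The arithmetic behind these figures can be checked with a short sketch. The value <tt>N=3</tt> is an assumed example; inside a real job, Slurm exports the corresponding values as <tt>SLURM_JOB_NUM_NODES</tt>, <tt>SLURM_NTASKS_PER_NODE</tt> and <tt>SLURM_CPUS_PER_TASK</tt>.

```shell
#!/bin/sh
# Example values only: N=3 nodes is an assumption for illustration.
nodes=3            # --nodes=N
tasks_per_node=5   # --ntasks-per-node=5
cpus_per_task=8    # --cpus-per-task=8

# Upper bound on processes running at once across all nodes (5*N).
max_processes=$(( nodes * tasks_per_node ))

# Each process may use one core per OpenMP thread, reported as 100% per core.
cpu_percent_per_process=$(( cpus_per_task * 100 ))

# Total cores kept busy when every slot is running.
total_cores=$(( max_processes * cpus_per_task ))

echo "processes: $max_processes, CPU per process: ${cpu_percent_per_process}%, cores: $total_cores"
```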