(Marked the Gnu Parallel page for translation) |
(Marked this version for translation) |
||
Line 1: | Line 1: | ||
<languages /> | <languages /> | ||
<translate> | <translate> | ||
== Introduction == | == Introduction == <!--T:1--> | ||
[http://www.gnu.org/software/parallel/ GNU Parallel] is a tool for running many sequential tasks at the same time on one or more nodes. It is useful for running a large number of sequential tasks, especially if they are short or of variable duration, as well as when doing a parameter sweep. We will only cover the basic options here; for more advanced usage, please see the [http://www.gnu.org/software/parallel/man.html official documentation]. | [http://www.gnu.org/software/parallel/ GNU Parallel] is a tool for running many sequential tasks at the same time on one or more nodes. It is useful for running a large number of sequential tasks, especially if they are short or of variable duration, as well as when doing a parameter sweep. We will only cover the basic options here; for more advanced usage, please see the [http://www.gnu.org/software/parallel/man.html official documentation]. | ||
<!--T:2--> | |||
By default, <tt>parallel</tt> will run as many tasks as the number of cores on the node, therefore maximizing resource usage. You can change this behaviour using the option <tt>--jobs</tt> followed by the number of simultaneous tasks that Gnu Parallel should run. When a task finishes, a new task will automatically be started by <tt>parallel</tt>. | By default, <tt>parallel</tt> will run as many tasks as the number of cores on the node, therefore maximizing resource usage. You can change this behaviour using the option <tt>--jobs</tt> followed by the number of simultaneous tasks that Gnu Parallel should run. When a task finishes, a new task will automatically be started by <tt>parallel</tt>. | ||
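As a rough illustration of the <tt>--jobs</tt> option described above, the sketch below caps the number of simultaneous tasks at four. The value 4, the variable name <tt>MAX_TASKS</tt>, and the <tt>echo</tt> placeholder task are assumptions made for the example only; <tt>--keep-order</tt> is added so that the output appears in input order.

```shell
#!/bin/bash
# Sketch: cap GNU Parallel at 4 simultaneous tasks (the value 4 is arbitrary).
# The "task" is a placeholder echo; substitute your own command.
MAX_TASKS=4

if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # Eight inputs, at most MAX_TASKS of them running at any moment.
    # --keep-order prints results in input order, not completion order.
    result=$(parallel --jobs "$MAX_TASKS" --keep-order echo "task {}" ::: 1 2 3 4 5 6 7 8)
else
    # Fallback so the script still runs where GNU Parallel is not installed.
    result="GNU Parallel not found"
fi
echo "$result"
```

Lowering the cap can be useful when each task needs a large share of the node's memory.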
== Basic Usage == | == Basic Usage == <!--T:3--> | ||
Parallel uses curly brackets <tt>{}</tt> as parameters for the command to be run. For example, to run <tt>gzip</tt> on all the text files in a directory, you can execute | Parallel uses curly brackets <tt>{}</tt> as parameters for the command to be run. For example, to run <tt>gzip</tt> on all the text files in a directory, you can execute | ||
{{Command|ls *.txt {{!}} parallel gzip {{(}}{{)}} }} | {{Command|ls *.txt {{!}} parallel gzip {{(}}{{)}} }} | ||
<!--T:4--> | |||
An alternative syntax is to use <tt>:::</tt>, such as this example: | An alternative syntax is to use <tt>:::</tt>, such as this example: | ||
{{Command | {{Command | ||
Line 19: | Line 21: | ||
}} | }} | ||
<!--T:5--> | |||
Note that Gnu Parallel refers to each of the commands executed as <i>jobs</i>. This can be confusing because on many Compute Canada systems, a job is a batch script run by a scheduler or resource manager, and Gnu Parallel would be used inside that job. From that perspective, Gnu Parallel's jobs are <i>sub-jobs</i>. | Note that Gnu Parallel refers to each of the commands executed as <i>jobs</i>. This can be confusing because on many Compute Canada systems, a job is a batch script run by a scheduler or resource manager, and Gnu Parallel would be used inside that job. From that perspective, Gnu Parallel's jobs are <i>sub-jobs</i>. | ||
== Multiple Arguments == | == Multiple Arguments == <!--T:6--> | ||
You can also use multiple arguments by enumerating them, for example: | You can also use multiple arguments by enumerating them, for example: | ||
{{Command | {{Command | ||
Line 34: | Line 37: | ||
}} | }} | ||
== File Content as Argument List == | == File Content as Argument List == <!--T:7--> | ||
The syntax <tt>::::</tt> takes the content of a file to generate the list of values for the arguments. For example, if you have a list of parameter values in the file <tt>mylist.txt</tt>, you may display its content with: | The syntax <tt>::::</tt> takes the content of a file to generate the list of values for the arguments. For example, if you have a list of parameter values in the file <tt>mylist.txt</tt>, you may display its content with: | ||
{{Command|parallel echo {{(}}1{{)}} :::: mylist.txt}} | {{Command|parallel echo {{(}}1{{)}} :::: mylist.txt}} | ||
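The <tt>::::</tt> syntax can be tried end to end with a small script such as the following sketch. The parameter values, the temporary directory, and the <tt>param=</tt> prefix are made up for illustration; only the file name <tt>mylist.txt</tt> comes from the text above.

```shell
#!/bin/bash
# Sketch: one parameter value per line in mylist.txt (values are made up).
workdir=$(mktemp -d)
printf '%s\n' 0.1 0.5 1.0 > "$workdir/mylist.txt"

if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # {1} is replaced by each line of the file in turn.
    values=$(parallel --keep-order echo "param={1}" :::: "$workdir/mylist.txt")
else
    # Fallback so the script still runs where GNU Parallel is not installed.
    values="GNU Parallel not found"
fi
echo "$values"
```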
== File Content as Command List == | == File Content as Command List == <!--T:8--> | ||
Gnu parallel can also interpret the lines of a file as the actual sub-jobs to be run in parallel, by using redirection. For example, if you have a list of sub-jobs in the file <tt>mycommands.txt</tt> (one per line), you may run them in parallel as follows: | Gnu parallel can also interpret the lines of a file as the actual sub-jobs to be run in parallel, by using redirection. For example, if you have a list of sub-jobs in the file <tt>mycommands.txt</tt> (one per line), you may run them in parallel as follows: | ||
{{Command|parallel < mycommands.txt}} | {{Command|parallel < mycommands.txt}} | ||
<!--T:9--> | |||
Note that no command argument is given to <tt>parallel</tt>. This usage mode can be particularly useful if the sub-jobs contain symbols that are special to GNU Parallel, or if each sub-job consists of several commands (e.g. <tt>cd dir1 && ./executable</tt>). | Note that no command argument is given to <tt>parallel</tt>. This usage mode can be particularly useful if the sub-jobs contain symbols that are special to GNU Parallel, or if each sub-job consists of several commands (e.g. <tt>cd dir1 && ./executable</tt>). | ||
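A command list of this kind is often generated by a loop. The sketch below builds <tt>mycommands.txt</tt> with one compound command per line and then feeds it to <tt>parallel</tt>. The directory names <tt>dir1</tt> to <tt>dir3</tt>, the file <tt>result.txt</tt>, and the <tt>echo</tt> payload are placeholders standing in for a real executable.

```shell
#!/bin/bash
# Sketch: build a command list, one sub-job per line, then feed it to parallel.
# dir1..dir3 and the echo payload are placeholders for real work.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

: > mycommands.txt
for i in 1 2 3; do
    mkdir -p "dir$i"
    # Each line is a complete shell command, including the && chaining.
    echo "cd dir$i && echo done > result.txt" >> mycommands.txt
done

if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    parallel < mycommands.txt
else
    # Fallback so the script still runs where GNU Parallel is not installed.
    echo "GNU Parallel not found; commands left in $workdir/mycommands.txt"
fi
```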
==Running on Multiple Nodes== | ==Running on Multiple Nodes== <!--T:10--> | ||
You can also use Gnu Parallel to distribute a workload across multiple nodes in a cluster, such as in the context of a job on a Compute Canada server. An example of this use is the following: | You can also use Gnu Parallel to distribute a workload across multiple nodes in a cluster, such as in the context of a job on a Compute Canada server. An example of this use is the following: | ||
{{Command | {{Command | ||
Line 51: | Line 55: | ||
In this case, we suppose that each node has 12 CPU cores and we will use the <tt>$PBS_NODEFILE</tt> file created automatically by the job scheduler to tell Gnu Parallel which nodes to use for the distribution of tasks. The <tt>--env</tt> option allows us to transfer a named environment variable to all the nodes, while the <tt>--workdir</tt> option ensures that the Gnu Parallel tasks will start in the same directory as on the main node. | In this case, we suppose that each node has 12 CPU cores and we will use the <tt>$PBS_NODEFILE</tt> file created automatically by the job scheduler to tell Gnu Parallel which nodes to use for the distribution of tasks. The <tt>--env</tt> option allows us to transfer a named environment variable to all the nodes, while the <tt>--workdir</tt> option ensures that the Gnu Parallel tasks will start in the same directory as on the main node. | ||
==Keeping Track of Completed and Failed Commands, and Restart Capabilities== | ==Keeping Track of Completed and Failed Commands, and Restart Capabilities== <!--T:11--> | ||
You can tell Gnu Parallel to keep track of which commands have completed by using the <tt>--joblog JOBLOGFILE</tt> argument. The file JOBLOGFILE will contain the list of completed commands, their start times, durations, hosts, and exit values. E.g. | You can tell Gnu Parallel to keep track of which commands have completed by using the <tt>--joblog JOBLOGFILE</tt> argument. The file JOBLOGFILE will contain the list of completed commands, their start times, durations, hosts, and exit values. E.g. | ||
{{Command|ls *.txt {{!}} parallel --joblog gzip.log gzip {{(}}{{)}} }} | {{Command|ls *.txt {{!}} parallel --joblog gzip.log gzip {{(}}{{)}} }} | ||
<!--T:12--> | |||
The job log functionality opens the door to a number of possible restart options. If the <tt>parallel</tt> command was interrupted (e.g. your job exceeded its requested walltime), you can make it pick up where it left off using the <tt>--resume</tt> option, e.g. | The job log functionality opens the door to a number of possible restart options. If the <tt>parallel</tt> command was interrupted (e.g. your job exceeded its requested walltime), you can make it pick up where it left off using the <tt>--resume</tt> option, e.g. | ||
{{Command|ls *.txt {{!}} parallel --resume --joblog gzip.log gzip {{(}}{{)}} }} | {{Command|ls *.txt {{!}} parallel --resume --joblog gzip.log gzip {{(}}{{)}} }} | ||
The new jobs will be appended to the old log file. | The new jobs will be appended to the old log file. | ||
<!--T:13--> | |||
If some of the subcommands failed (i.e., they produced a non-zero exit code), and you think that you have eliminated the source of the error, you can re-run the failed ones using the <tt>--resume-failed</tt> option, e.g. | If some of the subcommands failed (i.e., they produced a non-zero exit code), and you think that you have eliminated the source of the error, you can re-run the failed ones using the <tt>--resume-failed</tt> option, e.g. | ||
{{Command|ls *.txt {{!}} parallel --resume-failed --joblog gzip.log gzip {{(}}{{)}} }} | {{Command|ls *.txt {{!}} parallel --resume-failed --joblog gzip.log gzip {{(}}{{)}} }} | ||
(Note that this will also start sub-jobs that were not considered before.) | (Note that this will also start sub-jobs that were not considered before.) | ||
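Before re-running anything, it can help to see which commands the job log records as failed. The sketch below assumes the tab-separated layout that GNU Parallel writes to the job log (columns Seq, Host, Starttime, JobRuntime, Send, Receive, Exitval, Signal, Command). The helper name <tt>failed_commands</tt> is hypothetical, and the sample log is hand-written for illustration, not real output.

```shell
#!/bin/bash
# Sketch: list the commands that failed according to a joblog.
# Assumes the tab-separated layout GNU Parallel writes:
# Seq Host Starttime JobRuntime Send Receive Exitval Signal Command

failed_commands() {
    # Skip the header line (NR > 1). Column 7 is the exit value,
    # column 8 the signal, column 9 the command that was run.
    awk -F'\t' 'NR > 1 && ($7 != 0 || $8 != 0) { print $9 }' "$1"
}

# Hand-written sample log for illustration only, not real output.
sample_log=$(mktemp)
printf 'Seq\tHost\tStarttime\tJobRuntime\tSend\tReceive\tExitval\tSignal\tCommand\n' > "$sample_log"
printf '1\t:\t1700000000.000\t0.010\t0\t0\t0\t0\tgzip a.txt\n' >> "$sample_log"
printf '2\t:\t1700000000.100\t0.012\t0\t0\t1\t0\tgzip b.txt\n' >> "$sample_log"

failed=$(failed_commands "$sample_log")
echo "$failed"
```

With a real log, the same function could be called as <tt>failed_commands gzip.log</tt>.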
</translate> | </translate> |