Job arrays

Revision as of 15:53, 28 June 2018


This article is a draft: a work in progress that may or may not be ready for inclusion in the main wiki. It should not necessarily be considered factual or authoritative.




Parent page: Running jobs

If your work consists of a large number of tasks which differ only in some parameter, you can conveniently submit many tasks at once using a job array, also known as a task array or an array job. The individual tasks in the array are distinguished by an environment variable, $SLURM_ARRAY_TASK_ID, which Slurm sets to a different value for each task. You set the range of values with the --array parameter.

See Job Array Support at SchedMD.com for detailed documentation.

Examples of the --array parameter

sbatch --array=0-7       # $SLURM_ARRAY_TASK_ID takes values from 0 to 7 inclusive
sbatch --array=1,3,5,7   # $SLURM_ARRAY_TASK_ID takes the listed values
sbatch --array=1-7:2     # Step-size of 2, same as the previous example
sbatch --array=1-100%10  # Allows no more than 10 of the tasks to run simultaneously
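
To preview which values $SLURM_ARRAY_TASK_ID would take for a simple range, the standard seq command can mimic the start-end:step notation. This is only an illustrative sketch: the helper name preview_ids is hypothetical and not part of Slurm, and it does not handle comma-separated lists or the % limit.

```shell
#!/bin/bash
# preview_ids is a hypothetical helper, not a Slurm command.
# Usage: preview_ids START END [STEP]
preview_ids() {
    local start=$1 end=$2 step=${3:-1}
    # seq FIRST INCREMENT LAST prints one value per line; the unquoted
    # command substitution joins them with single spaces.
    echo $(seq "$start" "$step" "$end")
}

preview_ids 0 7      # corresponds to --array=0-7
preview_ids 1 7 2    # corresponds to --array=1-7:2
```

The second call prints the same values as the list form 1,3,5,7.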

A simple example

File : simple_array.sh

#!/bin/bash
#SBATCH --array=1-10
#SBATCH --time=3:00:00
# Select an input file by task ID
program_x <input.$SLURM_ARRAY_TASK_ID
# Pass the task ID as a command-line argument
program_y $SLURM_ARRAY_TASK_ID some_arg another_arg


This job will be scheduled as ten independent tasks. Each task has a separate time limit of 3 hours, and each may start at a different time on a different host.

The script uses $SLURM_ARRAY_TASK_ID in two ways: to select an input file (program_x) and to set a command-line argument for the application (program_y).
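
When the parameter is not the task ID itself, one possible approach is to look it up in a bash array indexed by the task ID. The following is a minimal sketch under stated assumptions: the TEMPERATURES values and the TASK_ID variable are made up for illustration, and the array range starts at 0 because bash arrays are zero-indexed.

```shell
#!/bin/bash
#SBATCH --array=0-3
#SBATCH --time=3:00:00

# Hypothetical parameter values; replace with your own.
TEMPERATURES=(250 275 300 325)

# Outside Slurm the variable is unset, so default to 0 for a dry run.
TASK_ID=${SLURM_ARRAY_TASK_ID:-0}
T=${TEMPERATURES[$TASK_ID]}

echo "Task $TASK_ID uses temperature $T"
# program_y "$T" some_arg another_arg   # program_y as in simple_array.sh
```

Keep the --array range in step with the number of entries in the bash array, otherwise some tasks will receive an empty value.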

Using a job array instead of a large number of separate serial jobs has advantages for you and other users. A waiting job array produces only one line of output in squeue, which makes that output easier to read. The scheduler does not have to analyze job requirements for each array task separately, so it can run more efficiently too.

Note that, other than the initial job-submission step with sbatch, the load on the scheduler is the same for an array job as for the equivalent number of non-array jobs. The cost of dispatching each array task is the same as dispatching a non-array job. You should not use a job array to submit tasks with very short run times, e.g. much less than an hour. Tasks with run times of only a few minutes should be grouped into longer jobs using Glost, GNU Parallel, or a shell loop.
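
As a rough sketch of the shell-loop option, each array task can work through a block of consecutive short tasks instead of a single one. The numbers here (10 array tasks of 100 items each) and the variable names CHUNK, FIRST and LAST are assumptions chosen for illustration; the placeholder command must be replaced with the real work.

```shell
#!/bin/bash
#SBATCH --array=0-9
#SBATCH --time=3:00:00

# Each array task handles CHUNK consecutive short tasks, so the ten
# array tasks together cover items 0 to 999. Numbers are illustrative.
CHUNK=100
TASK_ID=${SLURM_ARRAY_TASK_ID:-0}   # defaults to 0 for a dry run outside Slurm

FIRST=$(( TASK_ID * CHUNK ))
LAST=$(( FIRST + CHUNK - 1 ))

for i in $(seq "$FIRST" "$LAST"); do
    # Replace this no-op with the real short task, e.g. program_x <input.$i
    :
done
echo "Task $TASK_ID processed items $FIRST to $LAST"
```

Choose the chunk size so that each array task runs for a reasonable fraction of its time limit rather than a few minutes.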

Example: Multiple directories

Suppose you have multiple directories, each with the same structure, and you want to run the same script in each directory. If the directories can be named with sequential numbers then the example above can be easily adapted. If the names are not so systematic, then create a file with the names of the directories, like so:

$ cat case_list
pacific2016
pacific2017
atlantic2016
atlantic2017

There are several ways to select a given line from a file; this example uses sed to do so:


File : directories_array.sh

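#!/bin/bash
#SBATCH --time=3:00:00
#SBATCH --array=1-4

echo "Starting task $SLURM_ARRAY_TASK_ID"
# Pick line number $SLURM_ARRAY_TASK_ID from case_list
DIR=$(sed -n "${SLURM_ARRAY_TASK_ID}p" case_list)
# Stop if the directory cannot be entered (e.g. empty or misspelled entry)
cd "$DIR" || exit 1

# Place the code to execute here
pwd
ls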


Cautions:

  • Take care that the number of tasks you request matches the number of entries in the file.
  • The file case_list should not be changed until all the tasks in the array have run, since it will be read each time a new task starts.
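
One way to keep the array range in step with the list is to count the lines of case_list at submission time and pass the range on the command line, where it takes precedence over the #SBATCH --array line in the script. The sketch below only prints the sbatch command rather than running it, and works on a temporary copy of the four-entry list; count_cases is a hypothetical helper name.

```shell
#!/bin/bash
# count_cases is a hypothetical helper: it prints the number of lines in a file.
count_cases() {
    # The arithmetic expansion strips any padding that wc may add.
    echo $(( $(wc -l < "$1") ))
}

# Demonstration with a temporary copy of the four-line list shown above.
LIST=$(mktemp)
printf '%s\n' pacific2016 pacific2017 atlantic2016 atlantic2017 > "$LIST"

N=$(count_cases "$LIST")
echo "sbatch --array=1-$N directories_array.sh"

rm -f "$LIST"
```

In real use you would call count_cases on case_list itself and run the printed command. This assumes the file has one directory per line, no blank lines, and a newline at the end of the last line.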