What is a scheduler?
Latest revision as of 21:53, 20 July 2023
What's a job?
On computers, we are most often familiar with graphical user interfaces (GUIs). There are windows, menus, and buttons; we click here and there and the system responds. On our servers, the environment is different. To begin with, you control it by typing, not clicking. This is called a command line interface. Furthermore, a program you would like to run may not begin immediately, but may instead be put on a waiting list. It will begin once the necessary CPU cores are available; if jobs started without waiting for free cores, they would interfere with each other and lose performance.
You prepare a small text file called a job script that says what program to run, where to get the input, and where to put the output. You submit this job script to a piece of software called the scheduler, which decides when and where it will run. Once the job has finished, you can retrieve the results of the calculation. Normally there is no interaction between you and the program while the job is running, although you can check on its progress if you wish.
Here's a very simple job script:
#!/bin/bash
#SBATCH --time=00:01:00
echo 'Hello, world!'
sleep 30
It runs the programs echo and sleep; there is no input, and the output will go to a default location. Lines starting with #SBATCH are directives to the scheduler, providing information about what the job needs to run. This job, for example, only needs one minute of run time (00:01:00).
The job scheduler
The job scheduler is a piece of software with multiple responsibilities. It must
- maintain a database of jobs,
- enforce policies regarding limits and priorities,
- ensure resources are not overloaded, for example by only assigning each CPU core to one job at a time,
- decide which jobs to run and on which compute nodes,
- launch them on those nodes, and
- clean up after each job finishes.
On our clusters, these responsibilities are handled by the Slurm Workload Manager. All the examples and syntax shown on this page are for Slurm.
Requesting resources
You use the job script to ask for the resources needed to run your calculation. Among the resources associated with a job are time and number of processors. In the example above, the time requested is one minute and there will be one processor allocated by default since no specific number is given. Please refer to Examples of job scripts for other types of requests such as multiple processors, memory capacity and special processors such as GPUs.
It is important to specify those parameters well. If you ask for less than the calculation needs, the job will be killed for exceeding the requested time or memory limit. If you ask for more than it needs, the job may wait longer than necessary before it starts, and once running it will needlessly prevent others from using those resources.
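As a rough illustration of such a request, the sketch below asks for more than the defaults. The values (ten minutes, four cores, 4G of memory) are placeholders chosen for this example, not recommendations; --cpus-per-task and --mem are standard sbatch options. When the script is run outside the scheduler, the #SBATCH lines are ordinary comments and the core count falls back to 1.

```shell
#!/bin/bash
#SBATCH --time=00:10:00        # ten minutes of run time (placeholder value)
#SBATCH --cpus-per-task=4      # four CPU cores for one task (placeholder value)
#SBATCH --mem=4G               # 4G of memory per node (placeholder value)

# Slurm sets SLURM_CPUS_PER_TASK when --cpus-per-task is given;
# default to 1 so the script also runs outside a job.
cores="${SLURM_CPUS_PER_TASK:-1}"
echo "Running with ${cores} core(s)"
```

Reading the core count from the environment, instead of repeating the number in the script body, keeps the request and the program's behaviour in step when the directive is changed.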
A basic Slurm job
We can submit the job script simple_job.sh shown above with sbatch:
[someuser@host ~]$ sbatch simple_job.sh
Submitted batch job 1234
[someuser@host ~]$ sq
JOBID USER ACCOUNT NAME ST TIME_LEFT NODES CPUS GRES MIN_MEM NODELIST (REASON)
1234 someuser def-someprof simple_j R 0:33 1 1 (null) 256M blg9876 (None)
[someuser@host ~]$ cat slurm-1234.out
Hello, world!
Look at the ST column in the output of sq to determine the status of your jobs. The two most common states are PD for pending and R for running. When the job has finished, it no longer appears in the sq output.
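If you want to read the ST column from a script, one possible approach is sketched below. The function name job_states is made up for this example, and the sample text is the sq output shown above; the sketch assumes the header contains a column literally named ST and that the fields before it contain no spaces.

```shell
#!/bin/bash
# job_states (hypothetical helper): print "JOBID STATE" for each job line.
# It finds the position of the ST column from the header line.
job_states() {
    awk 'NR == 1 { for (i = 1; i <= NF; i++) if ($i == "ST") col = i; next }
         col     { print $1, $col }'
}

# Sample input copied from the session above, so this runs without a scheduler.
sample_output='JOBID USER ACCOUNT NAME ST TIME_LEFT NODES CPUS GRES MIN_MEM NODELIST (REASON)
1234 someuser def-someprof simple_j R 0:33 1 1 (null) 256M blg9876 (None)'

states="$(printf '%s\n' "$sample_output" | job_states)"
echo "$states"
```

On a cluster, the same function could be fed directly with sq | job_states.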
Notice that each job is assigned a job ID, a unique identification number printed when you submit the job (1234 in this example). You can have more than one job in the system at a time, and the ID number can be used to distinguish them even if they have the same name. Finally, because we didn't specify anywhere else to put it, the output is placed in a file named with the same job ID number, slurm-1234.out.
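If a script needs the job ID, for example to find the output file later, it can be taken from the message printed at submission. The following is a minimal sketch that works on the sample message from the session above rather than calling sbatch; the variable names are made up for this example.

```shell
#!/bin/bash
# Message as printed by sbatch in the session above.
submit_message='Submitted batch job 1234'

# Remove everything up to and including the last space, leaving the ID.
job_id="${submit_message##* }"
default_output="slurm-${job_id}.out"

echo "Job ID: ${job_id}"
echo "Default output file: ${default_output}"
```

On a cluster, submit_message would be set with submit_message="$(sbatch simple_job.sh)". sbatch also has a --parsable option that prints the job ID without the surrounding text (followed by a semicolon and the cluster name on multi-cluster setups).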
You can also specify options to sbatch on the command line. So for example,
[someuser@host ~]$ sbatch --time=00:30:00 simple_job.sh
will change the time limit of the job to 30 minutes. Any option set with an #SBATCH directive in the script can be overridden in this way.
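This makes it possible to reuse one job script with different settings. The sketch below only prints the commands it would run, so it is safe to try anywhere; the two time limits and the limit-... job names are arbitrary values chosen for this example.

```shell
#!/bin/bash
script=simple_job.sh

# Build one sbatch command per time limit; options given here take
# precedence over the #SBATCH directives inside the script.
for limit in 00:05:00 00:30:00; do
    cmd=(sbatch --time="$limit" --job-name="limit-${limit//:/}" "$script")
    echo "${cmd[*]}"        # replace with "${cmd[@]}" to actually submit
done
```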
Choosing where the output goes
If you want the output file to have a more distinctive name than slurm-1234.out, you can use --output to change it.
The following script sets a job name which will appear in the squeue output, and sends the output to a file prefixed with the job name and containing the job ID number, for example test-1234.out.
#!/bin/bash
#SBATCH --time=00:01:00
#SBATCH --job-name=test
#SBATCH --output=test-%J.out
echo 'Hello, world!'
Error output will normally appear in the same file, just as it would if you were typing commands interactively. If you wish, you can split the standard error channel (stderr) from the standard output channel (stdout) by specifying a file name with the -e (--error) option.
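A sketch of such a split is shown below. The here-document holds a job script that uses --error, the long form of -e; the file name split_job.sh, the .err file name and the error message are made up for this example. The last lines run the script locally with ordinary shell redirection, which stands in for what the scheduler does with --output and --error, so the effect can be seen without submitting anything.

```shell
#!/bin/bash
workdir="$(mktemp -d)"

# A job script that writes to both channels (hypothetical example).
cat > "${workdir}/split_job.sh" <<'EOF'
#!/bin/bash
#SBATCH --time=00:01:00
#SBATCH --job-name=test
#SBATCH --output=test-%J.out
#SBATCH --error=test-%J.err
echo 'Hello, world!'
echo 'Something went wrong' >&2
EOF

# Local stand-in for the scheduler: stdout and stderr go to separate files.
bash "${workdir}/split_job.sh" > "${workdir}/test.out" 2> "${workdir}/test.err"

out_text="$(cat "${workdir}/test.out")"
err_text="$(cat "${workdir}/test.err")"
echo "stdout file contains: ${out_text}"
echo "stderr file contains: ${err_text}"
```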
Accounts and projects
Information about your job, like how long it waited, how long it ran, and how many cores it used, is recorded so we can monitor our quality of service and so we can report to our funders how their money is spent. Every job must have an associated account name corresponding to a resource allocation project.
#SBATCH --account=def-user-ab
If you try to submit a job with sbatch without supplying an account name, and one is needed, you will be shown a list of valid account names to choose from.
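If you always use the same account, one option is to set it once in your shell session instead of in every job script. The sketch below relies on SBATCH_ACCOUNT, an environment variable that sbatch reads as the equivalent of --account; the account name def-someprof is taken from the sample sq output above and is only a placeholder.

```shell
#!/bin/bash
# Placeholder account name; replace it with one of your own valid accounts.
export SBATCH_ACCOUNT=def-someprof

# Later sbatch calls in this session pick the account up from the environment.
# The same choice can be made for a single job on the command line:
#   sbatch --account=def-someprof simple_job.sh
echo "Jobs will be charged to: ${SBATCH_ACCOUNT}"
```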