Best practices for job submission
Revision as of 16:10, 29 August 2022
When submitting a job to one of the clusters, it's important to choose appropriate values for the various parameters so that your job doesn't waste resources or create problems for other users and for yourself. Well-chosen values help your job start sooner and make it more likely to finish correctly, producing the output you need to move your research forward.
Job duration
For jobs which are not tests, the duration should be at least one hour. If your computation requires less than an hour, you should consider using tools like GLOST, META or GNU Parallel to regroup several of your computations into a single Slurm job with a duration of at least an hour. Hundreds or thousands of very short jobs place undue stress on the scheduler.
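As a minimal sketch of the regrouping idea, the script below runs many short computations one after another inside a single one-hour Slurm job. It uses a plain shell loop as a stand-in for GLOST, META or GNU Parallel, which additionally let the tasks run side by side. The function `process_one` and the `inputs/*.dat` files in the usage note are hypothetical placeholders, not part of any of these tools.

```shell
#!/bin/bash
#SBATCH --time=01:00:00     # one job of about an hour instead of many short jobs
#SBATCH --cpus-per-task=1

# Hypothetical stand-in for one short computation; replace with your own command.
process_one() {
    echo "processed $1"
}

# Run every task passed as an argument, one after another, inside this single job.
run_all() {
    local input
    for input in "$@"; do
        process_one "$input"
    done
}

# Tasks are taken from the arguments given to the script at submission time.
if [ "$#" -gt 0 ]; then
    run_all "$@"
fi
```

Assuming the script is saved as `batch.sh`, it could be submitted with `sbatch batch.sh inputs/*.dat`, so that all of the short tasks share one scheduler entry.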
It is equally important that your estimate of the job duration be reasonably accurate: if you ask for five days when the computation actually finishes after just sixteen hours, your job will spend much more time waiting to start than it would have with a more accurate estimate. It's natural to leave some room for error and to increase the duration by five or ten percent "just in case", but otherwise it's in your interest for the estimate to be as accurate as possible. You can see how long completed jobs took to run using the command
seff <jobid>
in the field Job Wall-clock time.
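The "five or ten percent" margin can be worked out mechanically. The helper below is only an illustrative sketch: `padded_time` is a hypothetical function name, and it assumes the measured wall-clock time is written as `HH:MM:SS` with no leading day count. It adds the chosen percentage and rounds up to the next second.

```shell
#!/bin/bash
# Convert a measured wall-clock time (HH:MM:SS) into a time request with a
# safety margin given as a percentage (10 if omitted).
padded_time() {
    local h m s margin total
    IFS=: read -r h m s <<< "$1"
    margin="${2:-10}"
    # 10# forces base 10 so that values such as "08" are not read as octal.
    total=$(( 10#$h * 3600 + 10#$m * 60 + 10#$s ))
    # Integer ceiling of total * (100 + margin) / 100.
    total=$(( (total * (100 + margin) + 99) / 100 ))
    printf '%02d:%02d:%02d\n' $(( total / 3600 )) $(( total % 3600 / 60 )) $(( total % 60 ))
}

padded_time 16:00:00 10   # prints 17:36:00
```

For the sixteen-hour computation mentioned above, a ten percent margin therefore suggests requesting about 17 hours and 36 minutes rather than five days.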