Please be cautious if you use a script to submit multiple Slurm jobs in a short time. Submitting thousands of jobs at a time can cause Slurm to become [[Frequently_Asked_Questions#.22sbatch:_error:_Batch_job_submission_failed:_Socket_timed_out_on_send.2Frecv_operation.22|unresponsive]] to other users. Consider using an [[Running jobs#Array job|array job]] instead, or use <code>sleep</code> to space out calls to <code>sbatch</code> by one second or more, as in the sketch below.
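As an illustration only (the script name <code>job_script.sh</code> and the file pattern are hypothetical), a submission loop might pause between calls to <code>sbatch</code> like this:

<source lang="bash">
#!/bin/bash
# Hypothetical example: submit one job per input file, pausing between
# calls to sbatch so the scheduler is not flooded with requests.
for input in data_*.txt; do
    sbatch job_script.sh "$input"
    sleep 1      # wait at least one second between submissions
done
</source>

An [[Running jobs#Array job|array job]] replaces such a loop with a single <code>sbatch</code> call and is usually the better choice when the jobs differ only by an index or input file.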
=== Memory === <!--T:161-->
<!--T:106-->
Memory may be requested with <code>--mem-per-cpu</code> (memory per core) or <code>--mem</code> (memory per node). On general-purpose (GP) clusters a default memory amount of 256 MB per core will be allocated unless you make some other request. At [[Niagara]] only whole nodes are allocated along with all available memory, so a memory specification is not required there.
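For example, a minimal job script (the resource values and the program name <code>my_program</code> are placeholders, not recommendations) might request memory per core like this:

<source lang="bash">
#!/bin/bash
#SBATCH --ntasks=4            # 4 CPU cores
#SBATCH --mem-per-cpu=2G      # 2 GB per core, 8 GB in total
#SBATCH --time=0-01:00        # 1 hour
srun ./my_program             # placeholder for your application
</source>

For a single-node job, <code>--mem=8G</code> would request the same total amount, expressed per node rather than per core.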
<!--T:162-->
A common source of confusion comes from the fact that some memory on a node is not available to the job (it is reserved for the operating system and other services). The effect of this is that each node type has a maximum amount available to jobs; for instance, nominally "128G" nodes are typically configured to permit 125G of memory for user jobs. If you request more memory than a node type provides, your job will be constrained to run on higher-memory nodes, which may be fewer in number.
<!--T:163-->
Adding to this confusion, Slurm interprets K, M, G, etc., as [https://en.wikipedia.org/wiki/Binary_prefix binary prefixes], so <code>--mem=125G</code> is equivalent to <code>--mem=128000M</code>. See the "available memory" column in the "Node types and characteristics" table for each GP cluster for the Slurm specification of the maximum memory you can request on each node: [[Béluga/en#Node_types_and_characteristics|Béluga]], [[Cedar#Node_types_and_characteristics|Cedar]], [[Graham#Node_types_and_characteristics|Graham]].
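As a sketch, assuming a nominally "128G" node that permits 125G (that is, 128000M) for user jobs, a whole-node memory request might look like this; check the table for your cluster before using a specific value, and <code>my_program</code> is again a placeholder:

<source lang="bash">
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=32
#SBATCH --mem=128000M         # all job-available memory on a nominal "128G" node
#SBATCH --time=0-03:00
srun ./my_program             # placeholder for your application
</source>

Writing the request in megabytes (<code>128000M</code>) rather than <code>128G</code> avoids accidentally asking for more than the node can provide, which would push the job onto scarcer higher-memory nodes.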