Advanced MPI scheduling



<!--T:24-->
Typical nodes in [[Cedar]], [[Graham]] and [[Niagara]] have the following CPU and memory configuration:

{| class="wikitable"
|-
! Cluster                !! Cores !! Usable memory           !! Notes
|-
| [[Graham]]             || 32    || 125 GiB (~3.9 GiB/core) || A fraction of these nodes are reserved for whole-node jobs.
|-
| [[Cedar]] (Broadwell)  || 32    || 125 GiB (~3.9 GiB/core) ||
|-
| [[Cedar]] (Skylake)    || 48    || 187 GiB (~3.9 GiB/core) || A fraction of these nodes are reserved for whole-node jobs.
|-
| [[Niagara]]            || 40    || 188 GiB                 || Niagara nodes can only be requested as whole nodes.
|}
 
If you have a large parallel job to run, one which can efficiently use all the cores of one or more whole nodes, you should request whole nodes like so:
 
<tabs>
<tab name="Graham">
{{File
  |name=whole_node_graham.sh
  |lang="sh"
  |contents=
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=32
#SBATCH --mem=128000M
srun application.exe
}}</tab>
<tab name="Cedar">
{{File
  |name=whole_node_graham.sh
  |lang="sh"
  |contents=#SBATCH --nodes=2
#SBATCH --ntasks-per-node=48
#SBATCH --mem=192000M
srun application.exe
}}</tab>
</tabs>
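Niagara schedules whole nodes only (see the table above). A minimal sketch of the corresponding request is shown below; the file name is illustrative, and no explicit memory request is included since whole nodes are allocated (an assumption — consult the Niagara documentation for site-specific requirements):

{{File
  |name=whole_node_niagara.sh
  |lang="sh"
  |contents=
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=40
srun application.exe
}}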
 
A whole-node job like the ones above will probably start sooner than an equivalent one that simply requests the same number of tasks (e.g. <code>--ntasks=64</code>), since on Graham and Cedar a fraction of the nodes is reserved exclusively for whole-node jobs.
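For comparison, a by-core request equivalent to the Graham example might look like the sketch below. The file name is illustrative, and the <code>--mem-per-cpu</code> value of 3900M is an assumption derived from the ~3.9 GiB/core figure in the table above:

{{File
  |name=by_core_graham.sh
  |lang="sh"
  |contents=
#!/bin/bash
#SBATCH --ntasks=64
#SBATCH --mem-per-cpu=3900M
srun application.exe
}}

Because such a request is not constrained to whole nodes, the 64 tasks may be spread across several partially used nodes, and it typically cannot use the nodes reserved for whole-node jobs, which is why the whole-node form tends to start sooner.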

