--mem-per-cpu=3G
srun application.exe
This will run 15 MPI processes. The cores could be allocated anywhere in the cluster. Since we don’t know in advance how many cores will reside on each node, if we want to specify memory, it should be done per CPU with <code>--mem-per-cpu</code>.
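For reference, a minimal sketch of a complete submission script along these lines (the 15-task count comes from the description above; <code>application.exe</code> is a placeholder for your program):
<pre>
#!/bin/bash
#SBATCH --ntasks=15         # 15 MPI processes, placed anywhere in the cluster
#SBATCH --mem-per-cpu=3G    # memory is requested per CPU, since node placement is unknown
srun application.exe
</pre>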
If for some reason we need all cores in a single node (to avoid communication overhead, for example), then
srun application.exe
will give us what we need. In this case we could also say <code>--mem-per-cpu=3G</code>. The main difference is that with <code>--mem-per-cpu=3G</code>, the job will be canceled if any of the processes exceeds 3GB, while with <code>--mem=45G</code>, the memory consumed by each individual process doesn't matter, as long as all of them together don’t use more than 45GB.
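As an illustration, a sketch of the single-node variant, assuming the same 15 tasks (45GB of per-node memory corresponds to 15 × 3GB; <code>application.exe</code> is again a placeholder):
<pre>
#!/bin/bash
#SBATCH --nodes=1             # keep all tasks on a single node
#SBATCH --ntasks-per-node=15  # 15 MPI processes on that node
#SBATCH --mem=45G             # memory is requested per node; --mem-per-cpu=3G would also work
srun application.exe
</pre>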
=== Hybrid jobs: MPI and OpenMP, or MPI and threads ===
--mem=48G
srun application.exe
This job is the same size as the last one: 16 tasks (that is, 16 MPI processes), each with 4 threads. The difference here is that we are sure of getting exactly 4 tasks on each of 4 different nodes. Recall that <code>--mem</code> requests memory ''per node'', so we use it instead of <code>--mem-per-cpu</code> for the reason described earlier.
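A sketch of a script matching this description (4 nodes × 4 tasks per node × 4 threads per task; <code>application.exe</code> is a placeholder, and tying the OpenMP thread count to <code>--cpus-per-task</code> is shown as one reasonable choice):
<pre>
#!/bin/bash
#SBATCH --nodes=4             # exactly 4 nodes
#SBATCH --ntasks-per-node=4   # 4 MPI processes per node, 16 in total
#SBATCH --cpus-per-task=4     # 4 threads per MPI process
#SBATCH --mem=48G             # memory per node
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK   # let OpenMP use the cores reserved for each task
srun application.exe
</pre>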
=== MPI and GPUs ===