=== Hybrid jobs: MPI and OpenMP, or MPI and threads ===
#!/bin/bash
#SBATCH --ntasks=16
#SBATCH --cpus-per-task=4
#SBATCH --mem-per-cpu=3G
srun application.exe
In this example a total of 64 cores will be allocated, but only 16 MPI processes (tasks) can and will be initialized. If the application is also OpenMP, each process will spawn 4 threads, one per core. Each process will have 12GB of memory allocated to it (4 cores at 3GB each). The tasks, in groups of 4 cores each, could be placed anywhere, from 2 up to 16 nodes.
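If the application sizes its thread pool from the standard OpenMP environment variable <code>OMP_NUM_THREADS</code>, it is worth setting it explicitly from the Slurm allocation. A minimal sketch (assuming the application reads <code>OMP_NUM_THREADS</code>, as most OpenMP runtimes do) is to replace the <code>srun</code> line with:

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK   # one OpenMP thread per core allocated to each task
srun application.exe

Here <code>SLURM_CPUS_PER_TASK</code> is the environment variable Slurm sets to the value of <code>--cpus-per-task</code>, so the thread count automatically follows the resource request.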
#!/bin/bash
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=4
#SBATCH --cpus-per-task=4
#SBATCH --mem=48G
srun application.exe
This job is the same size as the one immediately above: 16 tasks (that is, 16 MPI processes), each with 4 threads. The difference here is that we are sure of getting exactly 4 tasks on each of 4 different nodes. Recall that <code>--mem</code> requests memory ''per node'', so we can use it instead of <code>--mem-per-cpu</code> for the reason described earlier.
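To confirm that the placement matches the request, one illustrative check (not part of the original script) is to launch <code>hostname</code> through <code>srun</code> before the application and count tasks per node:

srun hostname | sort | uniq -c

With the request above this should print 4 node names, each with a count of 4, since <code>srun</code> launches one copy of the command per task.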
=== MPI and GPUs ===