Advanced MPI scheduling

=== Why srun instead of mpiexec or mpirun? ===
<code>mpirun</code> is a wrapper that enables communication between processes running on different machines. Modern schedulers already provide many of the services that <code>mpirun</code> needs. With Torque/Moab, for example, there is no need to pass <code>mpirun</code> the list of nodes on which to run or the number of processes to launch; the scheduler handles this automatically. With Slurm, task affinity is also resolved by the scheduler, so there is no need to specify options like
 
mpirun --map-by node:pe=4 -n 16  application.exe
 
As implied in the examples above, <code>srun application.exe</code> will automatically distribute the processes to precisely the resources allocated to the job.
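For instance, a minimal sketch of such a job script, where <code>application.exe</code> stands in for an MPI binary and the resource requests are illustrative placeholders:
 
 #!/bin/bash
 #SBATCH --ntasks=16          # number of MPI processes
 #SBATCH --mem-per-cpu=1G
 srun application.exe
 
No host list or process count is passed to <code>srun</code>; it takes them from the job's allocation.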
 
In programming terminology, <code>srun</code> is a higher level of abstraction than <code>mpirun</code>. Anything that can be done with <code>mpirun</code> can be done with <code>srun</code>, and more. It is the tool in Slurm for distributing any kind of computation. It replaces Torque's <code>pbsdsh</code>, for example, and much more. Think of <code>srun</code> as the Slurm "all-around parallel-tasks distributor": once a particular set of resources is allocated, the nature of your application doesn't matter (MPI, OpenMP, hybrid, serial farming, pipelining, multi-program, etc.); you just <code>srun</code> it.
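To illustrate the multi-program case, here is a sketch using Slurm's <code>--multi-prog</code> option; the file name <code>tasks.conf</code> and the executables named in it are hypothetical:
 
 srun --multi-prog tasks.conf
 
where <code>tasks.conf</code> maps task ranks to commands, one rule per line:
 
 # tasks.conf (hypothetical): <task ranks>  <command>
 0      ./coordinator.exe
 1-15   ./worker.exe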
 
Also, as should be expected, <code>srun</code> is fully coupled to Slurm. When you <code>srun</code> an application, a "job step" is started, the environment variables <code>SLURM_STEP_ID</code> and <code>SLURM_PROCID</code> are initialized correctly, and correct accounting information is recorded.
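A quick way to see these variables in action is a sketch like the following, where the <code>bash -c</code> one-liner stands in for a real application:
 
 srun bash -c 'echo "step $SLURM_STEP_ID, task $SLURM_PROCID on $(hostname)"'
 
Each task in the job step prints its own <code>SLURM_PROCID</code>, confirming that the variables are set per process.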


=== External links ===