Best practices for job submission



<!--T:16-->
* A further complication with parallel execution concerns '''the use of multiple nodes''': the software you are running must support ''distributed memory parallelism''.
** Most software able to run on more than one node uses '''the [[MPI]] standard''', so if the documentation doesn't mention MPI, or consistently refers to threading and thread-based parallelism, you will likely need to restrict yourself to a single node.
** Programs that have been parallelized to run across multiple nodes '''should be started using''' <code>srun</code> rather than <code>mpirun</code>; see the sketch below.
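
For illustration, here is a minimal sketch of such a job script; the program name and the memory and time values are placeholders, not recommendations:

<pre>
#!/bin/bash
#SBATCH --ntasks=80           # total number of MPI processes
#SBATCH --mem-per-cpu=2G      # memory per process (placeholder value)
#SBATCH --time=01:00:00       # time limit (placeholder value)

# srun starts one copy of the program for each task in the
# allocation, so no separate mpirun invocation is needed.
srun ./my_mpi_program
</pre>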


<!--T:17-->
* A goal should also be to '''avoid scattering your parallel processes across more nodes than is necessary''': a more compact distribution will usually help your job's performance.
** Highly fragmented parallel jobs often exhibit poor performance and also complicate the scheduler's work. For this reason, you should try to submit jobs in which the number of parallel processes is an integer multiple of the number of cores per node, assuming this is compatible with the parallel software your jobs run.
** On a cluster with 40 cores per node, for example, you would always submit parallel jobs asking for 40, 80, 120, 160, etc. processes. With the following job script header, all 120 MPI processes would be assigned in the most compact fashion, using three whole nodes.
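
A sketch of such a header, assuming 40 cores per node; the memory and time values and the program name are placeholders:

<pre>
#!/bin/bash
#SBATCH --nodes=3             # three whole nodes
#SBATCH --ntasks-per-node=40  # 40 MPI processes per node, 120 in total
#SBATCH --mem-per-cpu=2G      # memory per process (placeholder value)
#SBATCH --time=01:00:00       # time limit (placeholder value)

srun ./my_mpi_program
</pre>

Requesting whole nodes in this way guarantees the compact layout: the scheduler cannot satisfy the request without placing exactly 40 processes on each of three nodes.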