Best practices for job submission: Difference between revisions

Jump to navigation Jump to search
Line 35: Line 35:


A further complication with parallel execution concerns the use of multiple nodes - many of the programming techniques used to allow a program to run in parallel assume the existence of a shared memory environment, i.e. multiple cores can be used but they must all be located on the same node. In this case, the maximum number of cores available on a single node provides a ceiling for the number of cores you can use. Trying to ask for more cores than this or using more than one node will fail because the software you are running does not support <i>distributed memory parallelism</i>. Most software able to run over more than one node uses the [[MPI]] standard, so if the documentation doesn't mention MPI or consistently refers to threading and thread-based parallelism, this likely means you will need to restrict yourself to a single node. Programs that have been parallelized to run across multiple nodes should be started using <tt>srun</tt> rather than <tt>mpirun</tt>.  
A further complication with parallel execution concerns the use of multiple nodes - many of the programming techniques used to allow a program to run in parallel assume the existence of a shared memory environment, i.e. multiple cores can be used but they must all be located on the same node. In this case, the maximum number of cores available on a single node provides a ceiling for the number of cores you can use. Trying to ask for more cores than this or using more than one node will fail because the software you are running does not support <i>distributed memory parallelism</i>. Most software able to run over more than one node uses the [[MPI]] standard, so if the documentation doesn't mention MPI or consistently refers to threading and thread-based parallelism, this likely means you will need to restrict yourself to a single node. Programs that have been parallelized to run across multiple nodes should be started using <tt>srun</tt> rather than <tt>mpirun</tt>.  
When using multiple nodes, a goal should also be to avoid scattering your parallel processes across more nodes than is necessary: a more compact distribution will usually help your job's performance. If you know the nodes of the cluster you're using have for example 40 cores, you can use syntax like
<source>
#SBATCH --nodes=3
#SBATCH --ntasks-per-node=40
</source>
to ensure that your 120 MPI processes will be assigned in the most compact fashion, using three whole nodes. Highly fragmented parallel jobs often exhibit poor performance and also make the scheduler's job more complicated. This being the case, you should try to submit jobs where the number of parallel processes is equal to an integral multiple of the number of cores per node,  assuming this is compatible with the parallel software your jobs run. So on a cluster with 40 cores/node, you would always submit parallel jobs asking for 40, 80, 120, 160, 240 etc. processes. 


Ultimately, the goal should be to ensure that the CPU efficiency of your jobs is very close to 100%, as measured by the field of this name in the output from the <tt>seff</tt> command; any value of CPU efficiency less than 90% is poor and means that your use of whatever software your job executes needs to be improved.
Ultimately, the goal should be to ensure that the CPU efficiency of your jobs is very close to 100%, as measured by the field of this name in the output from the <tt>seff</tt> command; any value of CPU efficiency less than 90% is poor and means that your use of whatever software your job executes needs to be improved.
Bureaucrats, cc_docs_admin, cc_staff
2,320

edits

Navigation menu