Best practices for job submission


<!--T:12-->
* Your <code>Memory Efficiency</code> in the output from the <code>seff</code> command '''should be at least 80% to 85%''' in most cases.
** Much like with the duration of your job, the goal when requesting memory is to ensure that the amount is sufficient, with a certain margin of error; see the sketch after this list.
* If you plan on using a '''whole node''' for your job, it is natural to also '''use all of its available memory''', which you can express using the line <code>#SBATCH --mem=0</code> in your job submission script.
** Note however that most of our clusters offer nodes with varying amounts of memory, so using this approach means your job will likely be assigned one of the smaller-memory nodes.
* If your testing has shown that you need a '''large memory node''', then you will want to use a line like <code>#SBATCH --mem=1500G</code>, for example, to request a node with 1500 GB (1.46 TiB) of memory.
** There are relatively few of these large-memory nodes, so your job will wait much longer to run; make sure your job really needs all this extra memory.
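
A minimal sketch of what this looks like in practice; the program name and the time and memory values below are illustrative assumptions, not recommendations:

<pre>
#!/bin/bash
#SBATCH --time=02:00:00   # measured runtime plus a margin
#SBATCH --mem=10G         # e.g. ~8 GB observed peak usage plus a ~20% margin
# For a whole-node job, the request would instead be:  #SBATCH --mem=0

./my_program              # hypothetical executable

# After the job completes, check the actual usage with
#   seff JOBID
# and aim for a Memory Efficiency of roughly 80% to 85%.
</pre>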


<!--T:14-->
* By default your job will get one core on one node, and this is the most sensible policy because '''most software is serial''': it can only ever make use of a single core (see the sketch after this list).
** Asking for more cores and/or nodes will not make a serial program run any faster; for it to run in parallel, the program's source code must be modified, in some cases in a very profound manner requiring a substantial investment of developer time.
* How can you '''determine if''' the software you're using '''can run in parallel'''?
** The best approach is to '''look in the software's documentation''' for a section on parallel execution: if you can't find anything, this is usually a sign that the program is serial.
** You can also '''contact the development team''' to ask if the software can be run in parallel and, if not, to request that such a feature be added in a future version.
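
For the default, serial case, a minimal sketch of a submission script (the program name and resource values are illustrative assumptions):

<pre>
#!/bin/bash
#SBATCH --time=01:00:00   # illustrative runtime
#SBATCH --mem=4G          # illustrative memory request
# No --ntasks or --cpus-per-task lines are needed here:
# by default Slurm allocates a single task with a single core on one node.

./serial_program          # hypothetical serial executable
</pre>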


<!--T:15-->
* If the program can run in parallel, the next question is: '''how many cores to use?'''
** Many of the programming techniques used to allow a program to run in parallel assume the existence of a ''shared memory environment'', i.e. multiple cores can be used but they must all be located on the same node. In this case, the maximum number of cores available on a single node provides a ceiling for the number of cores you can use (see the sketch after this list).
** It may be tempting to simply request "as many cores as possible", but this is often not the wisest approach. Just as having too many cooks trying to work together in a small kitchen to prepare a single meal can lead to chaos, so too adding an excessive number of CPU cores can have the perverse effect of slowing down a program.
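
A minimal sketch for a shared-memory (multi-threaded, e.g. OpenMP) program; the core count, memory value, and program name are illustrative assumptions:

<pre>
#!/bin/bash
#SBATCH --time=03:00:00       # illustrative runtime
#SBATCH --mem-per-cpu=2G      # illustrative per-core memory request
#SBATCH --cpus-per-task=4     # the threads share memory, so all cores sit on one node
# Tell an OpenMP program how many threads to start, matching the request above.
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

./threaded_program            # hypothetical multi-threaded executable
</pre>

Testing the program at a few different core counts, and keeping only as many cores as continue to give a worthwhile speed-up, avoids the too-many-cooks effect described above.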