<!--T:12-->
* Your <code>Memory Efficiency</code> in the output from the <code>seff</code> command <b>should be at least 80% to 85%</b> in most cases.
** Much like with the duration of your job, the goal when requesting memory is to ensure that the amount is sufficient, with a certain margin of error (a sample submission script is sketched after this list).
* If you plan on using a <b>whole node</b> for your job, it is natural to also <b>use all of its available memory</b>, which you can express using the line <code>#SBATCH --mem=0</code> in your job submission script.
** Note however that most of our clusters offer nodes with variable amounts of memory available, so using this approach means your job will likely be assigned a node with less memory.
* If your testing has shown that you need a <b>large memory node</b>, then you will want to use a line such as <code>#SBATCH --mem=1500G</code> to request a node with 1500 GB (or 1.46 TB) of memory.
** There are relatively few of these large memory nodes, so your job will wait much longer to run; make sure your job really needs all this extra memory.
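As an illustration only, a minimal submission script with an explicit memory request could look like the sketch below; the memory amount, time limit and program name (<code>my_program</code>) are placeholders to be replaced with values based on your own test runs.

<pre>
#!/bin/bash
#SBATCH --time=03:00:00      # requested duration, with a margin of error
#SBATCH --mem=16G            # requested memory, based on earlier test runs
#SBATCH --cpus-per-task=1    # a serial program only needs one core

./my_program input.dat       # placeholder program and input file
</pre>

Once such a job has finished, running <code>seff</code> with its job ID shows whether the request was well calibrated: if the reported <code>Memory Efficiency</code> is well below 80%, the <code>--mem</code> value can be lowered for subsequent jobs.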
<!--T:14-->
* By default, your job will get one core on one node, and this is the most sensible policy because <b>most software is serial</b>: it can only ever make use of a single core.
** Asking for more cores and/or nodes will not make the serial program run any faster, because for it to run in parallel the program's source code needs to be modified, in some cases in a very profound manner requiring a substantial investment of developer time.
* How can you <b>determine if</b> the software you're using <b>can run in parallel</b>?
** The best approach is to <b>look in the software's documentation</b> for a section on parallel execution: if you can't find anything, this is usually a sign that this program is serial.
** You can also <b>contact the development team</b> to ask if the software can be run in parallel and if not, to request that such a feature be added in a future version.
<!--T:15-->
* If the program can run in parallel, the next question is: <b>how many cores should you use?</b>
** Many of the programming techniques used to allow a program to run in parallel assume the existence of a ''shared memory environment'', i.e. multiple cores can be used but they must all be located on the same node. In this case, the maximum number of cores available on a single node provides a ceiling for the number of cores you can use (a sample request for such a job is sketched after this list).
** It may be tempting to simply request "as many cores as possible", but this is often not the wisest approach. Just as having too many cooks trying to work together in a small kitchen to prepare a single meal can lead to chaos, so too adding an excessive number of CPU cores can have the perverse effect of slowing down a program.
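For a program that supports shared memory parallelism, a request could look like the sketch below; the core count, memory, time limit and program name are illustrative placeholders, and the <code>OMP_NUM_THREADS</code> line applies only if the program is parallelized with OpenMP.

<pre>
#!/bin/bash
#SBATCH --nodes=1             # shared memory parallelism: all cores must be on the same node
#SBATCH --cpus-per-task=8     # number of cores; 8 is an arbitrary example value
#SBATCH --mem-per-cpu=2G      # memory request scales with the number of cores
#SBATCH --time=03:00:00

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK   # tell an OpenMP program how many cores it was given
./my_threaded_program input.dat               # placeholder program and input file
</pre>

A reasonable way to settle on a core count is to benchmark the program at a few values (for example 1, 2, 4 and 8 cores) and keep the smallest count beyond which the run time no longer improves appreciably.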