Translations:Using node-local storage/11/en: Difference between revisions

From Alliance Doc
m (FuzzyBot moved page Translations:Using $SLURM TMPDIR/11/en to Translations:Usng local scratch/11/en without leaving a redirect: Part of translatable page "Using $SLURM TMPDIR")
m (FuzzyBot moved page Translations:Usng local scratch/11/en to Translations:Using local scratch/11/en without leaving a redirect: Part of translatable page "Usng local scratch")
(No difference)

Revision as of 15:54, 31 March 2021

Message definition (Using node-local storage)
Output data must be copied from <code>$SLURM_TMPDIR</code> back to some permanent storage before the
job ends. If a job times out, then the last few lines of the job script might not
be executed. This can be addressed in three ways:
* request enough runtime to let the application finish, although we understand that this isn't always possible;
* write [[Points_de_contrôle/en|checkpoints]] to network storage, not to <code>$SLURM_TMPDIR</code>;
* write a signal-trapping function that copies the output back to permanent storage when the job is signalled.
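
The signal-trapping approach can be sketched as follows. This is a minimal illustration, not a recommended template. The names <code>RESULTS_DIR</code>, <code>save_output</code>, <code>on_signal</code> and <code>run_application</code> are hypothetical, and the 120-second warning requested with <code>--signal</code> is an assumption that should be adjusted to how long the copy takes. Outside of a Slurm job, the sketch falls back to temporary directories so that it can be tried out.

```shell
#!/bin/bash
#SBATCH --time=01:00:00
#SBATCH --signal=B:USR1@120   # assumption: send SIGUSR1 to the batch shell 120 s before the time limit

# Hypothetical locations. In a real job, RESULTS_DIR would point to permanent
# (network) storage; the mktemp fallbacks only make the sketch runnable anywhere.
SLURM_TMPDIR="${SLURM_TMPDIR:-$(mktemp -d)}"
RESULTS_DIR="${RESULTS_DIR:-$(mktemp -d)}"

save_output() {
    # Copy everything in node-local storage back to permanent storage.
    mkdir -p "$RESULTS_DIR"
    cp -r "$SLURM_TMPDIR"/. "$RESULTS_DIR"/
}

on_signal() {
    # Runs when the batch shell receives USR1 or TERM.
    echo "Signal caught, saving output before the job ends" >&2
    kill -TERM "$APP_PID" 2>/dev/null
    wait "$APP_PID" 2>/dev/null
    save_output
    exit 1
}
trap on_signal USR1 TERM

run_application() {
    # Placeholder for the real application: writes one file to node-local storage.
    echo "partial result" > "$SLURM_TMPDIR/output.dat"
    sleep 1
}

# Run the application in the background and wait for it. Bash only runs a trap
# handler between foreground commands, but "wait" is interrupted by a signal.
run_application &
APP_PID=$!
wait "$APP_PID"

# Normal path: the application finished in time.
save_output
```

The application is started in the background because bash defers trap handlers until the current foreground command returns. With a long-running foreground command, the handler would not run until it was too late.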

Output data must be copied from $SLURM_TMPDIR back to some permanent storage before the job ends. If a job times out, then the last few lines of the job script might not be executed. This can be addressed in two ways:

  • Request enough run time to let the application finish, although we understand that this isn't always possible.
  • Write checkpoints to network storage, not to $SLURM_TMPDIR.