Translations:Using node-local storage/11/en: Difference between revisions

m (FuzzyBot moved page Translations:Using local scratch/11/en to Translations:Using node-local storage/11/en without leaving a redirect: Part of translatable page "Using local scratch")
(Importing a new version from external source)
Line 2: Line 2:
  job ends.  If a job times out, then the last few lines of the job script might not
  be executed.  This can be addressed two ways:
− * First, obviously, request enough run time to let the application finish.  We understand that this isn't always possible.
+ * First, obviously, request enough runtime to let the application finish, although we understand that this isn't always possible.
  * Write [[Points_de_contrôle/en|checkpoints]] to network storage, not to <code>$SLURM_TMPDIR</code>.

Revision as of 18:02, 15 March 2023

Message definition (Using node-local storage)
Output data must be copied from <code>$SLURM_TMPDIR</code> back to some permanent storage before the
job ends.  If a job times out, then the last few lines of the job script might not 
be executed.  This can be addressed three ways:
* request enough runtime to let the application finish, although we understand that this isn't always possible;
* write [[Points_de_contrôle/en|checkpoints]] to network storage, not to <code>$SLURM_TMPDIR</code>;
* write a signal trapping function.

Output data must be copied from $SLURM_TMPDIR back to some permanent storage before the job ends. If a job times out, then the last few lines of the job script might not be executed. This can be addressed two ways:

  • First, obviously, request enough runtime to let the application finish, although we understand that this isn't always possible.
  • Write checkpoints to network storage, not to $SLURM_TMPDIR.
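
The message definition above also lists a signal trapping function as a way to handle time-outs. A minimal sketch of that idea follows, assuming the job script is run by bash and that Slurm is asked (via the <code>--signal</code> option of <code>sbatch</code>) to send SIGTERM to the batch shell shortly before the time limit. The 120-second lead time, the <code>DEST_DIR</code> variable, the <code>save_output</code> function name, and the placeholder application are all illustrative assumptions, not part of the original page.

```shell
#!/bin/bash
#SBATCH --time=01:00:00
# Ask Slurm to send SIGTERM to the batch shell 120 s before the time limit
# (the lead time is an assumption; choose one long enough for the copy).
#SBATCH --signal=B:TERM@120

# Assumption: fall back to temporary directories so the sketch also runs
# outside a Slurm job. DEST_DIR is a hypothetical permanent storage location.
SLURM_TMPDIR="${SLURM_TMPDIR:-$(mktemp -d)}"
DEST_DIR="${DEST_DIR:-$(mktemp -d)}"

save_output() {
    # Copy everything in the node-local output directory to permanent storage.
    mkdir -p "$DEST_DIR"
    cp -r "$SLURM_TMPDIR/output/." "$DEST_DIR/"
}

# When the batch shell receives SIGTERM, save the output and then exit.
trap 'save_output; exit 143' TERM

mkdir -p "$SLURM_TMPDIR/output"

# Placeholder for the real application. Running it in the background and
# calling wait lets bash run the trap as soon as the signal arrives,
# instead of after the application has finished.
( echo "result" > "$SLURM_TMPDIR/output/result.txt" ) &
wait

# Normal completion: copy the output back as usual.
save_output
```

The trap only helps if the copy finishes within the time left after the signal arrives, so writing checkpoints to network storage remains useful alongside it.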