Translations:Using node-local storage/11/en

From Alliance Doc
Revision as of 18:02, 15 March 2023 by FuzzyBot (talk | contribs) (Importing a new version from external source)

Output data must be copied from $SLURM_TMPDIR back to permanent storage before the job ends. If a job times out, the last few lines of the job script, which usually include this copy step, might not be executed. This can be addressed in two ways:

  • Request enough run time for the application to finish, although we understand that this isn't always possible.
  • Write checkpoints to network storage, not to $SLURM_TMPDIR.
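
The two points above can be sketched as a job script. This is only an illustration under stated assumptions: the time limit, the directory names `results` and `checkpoints`, and the `run_application` function are hypothetical placeholders for your own application and project layout. The fallbacks on `SLURM_TMPDIR` and `SLURM_SUBMIT_DIR` exist only so the sketch can be tried outside of a Slurm job.

```shell
#!/bin/bash
#SBATCH --time=03:00:00   # hypothetical value; request enough time for the run to finish

# Node-local working directory. Falls back to a temporary directory
# when the script is tried outside of a Slurm job.
WORK_DIR="${SLURM_TMPDIR:-$(mktemp -d)}"

# Hypothetical permanent locations under the submission directory,
# assumed here to be on network storage.
BASE_DIR="${SLURM_SUBMIT_DIR:-$PWD}"
RESULTS_DIR="$BASE_DIR/results"
CHECKPOINT_DIR="$BASE_DIR/checkpoints"
mkdir -p "$RESULTS_DIR" "$CHECKPOINT_DIR"

# Placeholder for the real application: it writes each checkpoint
# directly to network storage and its final output to node-local storage.
run_application() {
    local step
    for step in 1 2 3; do
        echo "state after step $step" > "$CHECKPOINT_DIR/checkpoint.txt"
    done
    echo "final result" > "$WORK_DIR/output.txt"
}

cd "$WORK_DIR" || exit 1
run_application

# Copy the output back to permanent storage before the job ends.
# If the job times out, this line may never run, but the checkpoints
# written above are already on network storage.
cp "$WORK_DIR/output.txt" "$RESULTS_DIR/"
```

Because the checkpoints never pass through $SLURM_TMPDIR, a job that is killed at its time limit loses at most the work done since the last checkpoint, even though the final copy step is skipped.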