Bureaucrats, cc_docs_admin, cc_staff, rsnt_translations
2,837
edits
No edit summary |
No edit summary |
||
Line 13: | Line 13: | ||
* Operations will occasionally time out with a message like "Socket timed out on send/recv operation" or "Unable to contact slurm controller (connect failure)". As a temporary workaround, attempt to resubmit your jobs/commands, they should go through in a few seconds. ([[User:Nathanw|Nathan Wielenga]]) 08:50, 18 July 2017 (MDT)) | * Operations will occasionally time out with a message like "Socket timed out on send/recv operation" or "Unable to contact slurm controller (connect failure)". As a temporary workaround, attempt to resubmit your jobs/commands, they should go through in a few seconds. ([[User:Nathanw|Nathan Wielenga]]) 08:50, 18 July 2017 (MDT)) | ||
** Should be resolved after a VHD migration to a new backend for slurmctl. (NW) | ** Should be resolved after a VHD migration to a new backend for slurmctl. (NW) | ||
* The environment of the shell in which a job was submitted is exported to the job. This can lead to irreproducible results. | |||
** Solution/workaround: Add the option <tt>#SBATCH --export=NONE</tt> to your job script. | |||
== Quota and filesystem problems == <!--T:7--> | == Quota and filesystem problems == <!--T:7--> |