(Updating to match new version of source page)
* SLURM operations will occasionally time out with a message like "Socket timed out on send/recv operation" or "Unable to contact slurm controller (connect failure)". As a temporary workaround, resubmit your jobs or commands; they should go through after a few seconds. ([[User:Nathanw|Nathan Wielenga]] 08:50, 18 July 2017 (MDT))
** Should be resolved after a VHD migration to a new backend for slurmctl. (NW)
* Some users get the error "error: Job submit/allocate failed: Invalid account or account/partition combination specified".
** Solution: Specify '--account=<accounting group>' when submitting the job.
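
The two workarounds above (resubmitting after a timeout, and passing an accounting group) could be combined in a small wrapper. This is only a sketch under stated assumptions: <code>retry_cmd</code> is a hypothetical helper, not a Slurm or site-provided tool, and the retry count and delay are arbitrary. It retries on any non-zero exit status, not only on the timeout messages quoted above.

```shell
# Hypothetical helper (not part of Slurm): retry a command with a short pause.
# Usage: retry_cmd <max_attempts> <delay_seconds> <command> [args...]
retry_cmd() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while true; do
        # Run the command; stop as soon as it succeeds.
        if "$@"; then
            return 0
        fi
        # Give up once the attempt budget is used.
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "retry_cmd: giving up after $attempt attempts: $*" >&2
            return 1
        fi
        echo "retry_cmd: attempt $attempt failed, retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}

# Example (commented out; def-someuser and job.sh are placeholders):
# retry_cmd 5 10 sbatch --account=def-someuser job.sh
```

A submission that reports a timeout may still have been accepted by the controller, so it is worth checking <code>squeue -u $USER</code> afterwards for duplicate jobs.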
= Graham only =
* Compute nodes cannot access the Internet.
** Solution: Request an exception by writing to support@computecanada.ca. Describe what you need to access and why.
* Crontab does not work on Graham. Attempting to add a new entry fails with an error on save:
<pre> | |||
[rozmanov@gra-login1 ~]$ crontab -e | |||
no crontab for rozmanov - using an empty one | |||
crontab: installing new crontab | |||
/var/spool/cron/#tmp.gra-login1.XXXXKsp8LU: Read-only file system | |||
crontab: edits left in /tmp/crontab.u0ljzU | |||
</pre> | |||
Crontab does work on Cedar, so there should be a common approach across CC systems.
The main issue is how to handle users' crontabs across multiple login nodes,
but it is not clear whether we even want to support that.
= Other issues =