Known issues

Revision as of 19:36, 14 July 2017 by Gbnewby (talk | contribs)

Intro

Shared issues

  • The Slurm epilog does not fully clean up processes from ended jobs, especially if the job did not exit normally. (Greg Newby, Fri Jul 14 19:32:48 UTC 2017)
  • Email from Graham and Cedar is still being configured, so email job notifications from Slurm are failing. (Patrick Mann (talk) 17:17, 26 June 2017 (UTC))
    • Cedar email is working now (Patrick Mann (talk) 16:11, 6 July 2017 (UTC))
    • Graham email is working
  • The Slurm 'sinfo' command yields different resource-type detail on Graham and Cedar. (Greg Newby, 16:05, 23 June 2017 (UTC))
  • Local scratch on compute nodes has inconsistent naming. Cedar has /local and Graham has /localscratch.
  • The status page at http://status.computecanada.ca/ is not yet updated automatically, so it does not necessarily show the correct, current status.
  • "Nearline" capabilities are not yet available (see https://docs.computecanada.ca/wiki/National_Data_Cyberinfrastructure for a brief description of the intended functionality).
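
Until the local scratch naming is made consistent, a job script can probe for whichever directory exists on the node it runs on. The following is a minimal Python sketch, not an official recommendation: the two candidate paths come from the issue listed above, while the function name find_local_scratch and the fallback to the system temporary directory are assumptions made for illustration.

```python
import os
import tempfile

# Candidate paths from the issue above: Graham uses /localscratch, Cedar uses /local.
CANDIDATES = ("/localscratch", "/local")


def find_local_scratch(candidates=CANDIDATES):
    """Return the first candidate that is an existing, writable directory.

    Falls back to the system temporary directory (an assumption of this
    sketch) when none of the candidates is usable.
    """
    for path in candidates:
        if os.path.isdir(path) and os.access(path, os.W_OK):
            return path
    return tempfile.gettempdir()


if __name__ == "__main__":
    print(find_local_scratch())
```

Probing at run time keeps the same script usable on both clusters without hard-coding either name.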

Cedar only

  • Environment variables such as $SCRATCH and $PROJECT are not yet set, although the filesystems are available. (Greg Newby, 16:10, 21 June 2017 (UTC))

Graham only

  • Big-memory nodes still need to be added to the scheduler.
  • The scheduler has no network topology information.

Other issues