Known issues: Difference between revisions

no edit summary
Line 9:
== Scheduler errors == <!--T:6-->
* The CC Slurm configuration favors whole-node jobs. Where appropriate, users should request whole nodes rather than per-core resources. See [[Job_scheduling_policies#Whole_nodes_versus_cores;|Job Scheduling - Whole Node Scheduling]] ([[User:Pjmann|Patrick Mann]] 20:15, 17 July 2017 (UTC))
** CPU and GPU backfill partitions have been created on both clusters. A job submitted with a requested runtime under 24 hours is automatically entered into the cluster-wide backfill partition. This partition has low priority but lets serial jobs increase cluster utilization. ([[User:Nathanw|Nathan Wielenga]])
* CPU and GPU backfill partitions have been created on both clusters. A job submitted with a requested runtime under 24 hours is automatically entered into the cluster-wide backfill partition. This partition has low priority but lets serial jobs increase cluster utilization. ([[User:Nathanw|Nathan Wielenga]])
* The Slurm epilog does not fully clean up processes from ended jobs, especially if the job did not exit normally. ([[User:Gbnewby|Greg Newby]], Fri Jul 14 19:32:48 UTC 2017)
** This has greatly improved since the epilog.clean script was added, but nodes are still occasionally marked down for epilog failure. (NW)
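The 24-hour backfill rule noted above can be checked before submitting. The following is a minimal sketch, not the scheduler's actual logic: the helper names <code>slurm_time_to_seconds</code> and <code>is_backfill_eligible</code> are hypothetical, and the sketch assumes the threshold is compared against the time limit requested with <code>--time</code>, written in one of the standard Slurm time formats.

```python
# Hypothetical helpers for illustration; the real routing is done by Slurm itself.

# "<24hr" threshold taken from the backfill note above.
BACKFILL_LIMIT_SECONDS = 24 * 60 * 60


def slurm_time_to_seconds(limit: str) -> int:
    """Convert a Slurm time-limit string to seconds.

    Accepted forms: "minutes", "minutes:seconds", "hours:minutes:seconds",
    "days-hours", "days-hours:minutes" and "days-hours:minutes:seconds".
    """
    text = limit.strip()
    days = 0
    if "-" in text:
        # With a day prefix the remaining fields read hours[:minutes[:seconds]].
        day_part, text = text.split("-", 1)
        days = int(day_part)
        fields = [int(f) for f in text.split(":")]
        if len(fields) > 3:
            raise ValueError(f"unrecognized time limit: {limit!r}")
        fields += [0] * (3 - len(fields))
        hours, minutes, seconds = fields
    else:
        fields = [int(f) for f in text.split(":")]
        if len(fields) == 1:
            hours, minutes, seconds = 0, fields[0], 0
        elif len(fields) == 2:
            hours, minutes, seconds = 0, fields[0], fields[1]
        elif len(fields) == 3:
            hours, minutes, seconds = fields
        else:
            raise ValueError(f"unrecognized time limit: {limit!r}")
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def is_backfill_eligible(limit: str) -> bool:
    """Return True if the requested limit is strictly under 24 hours."""
    return slurm_time_to_seconds(limit) < BACKFILL_LIMIT_SECONDS


if __name__ == "__main__":
    for requested in ("30", "23:59:59", "1-00:00:00"):
        print(requested, is_backfill_eligible(requested))
```

Under these assumptions, a request of exactly 24 hours (<code>24:00:00</code> or <code>1-0</code>) is not under the threshold and would not qualify.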