Frequently Asked Questions: Difference between revisions

Revision as of 18:04, 14 March 2018

907 bytes added, 6 years ago

explain why projected START_TIME is useless

Rdickson

Bureaucrats, cc_docs_admin, cc_staff

2,879

edits

@@ Line 136: / Line 136: @@
 <!--T:21-->
 The <tt>LevelFS</tt> column gives you information about your over- or under-consumption of cluster resources: when <tt>LevelFS</tt> is greater than one, you are consuming fewer resources than your fair share, while if it is less than one you are consuming more. The more you overconsume resources, the closer the value gets to zero and the more your pending jobs decrease in priority. There is a memory effect to this calculation so the scheduler gradually "forgets" about any potential over- or under-consumption of resources from months past. Finally, note that the value of <tt>LevelFS</tt> is unique to the specific cluster.
+== How accurate is START_TIME in <tt>squeue</tt> output? ==
+Slurm projects a start time for jobs that are high on its priority list but have not yet started. The projection can only be computed from the information available at that moment, specifically:
+* what jobs are currently running, and when they will end, and
+* what jobs are currently waiting, and what their relative priorities are.
+To make any projection at all, Slurm has to assume
+* that the time limits of running jobs are accurate, that is, that each job will run right up to its limit, and
+* that no new jobs will arrive and perturb the priority list.
+Both assumptions are usually wrong: jobs often finish before their time limits, and new jobs arrive constantly. The second is perhaps the more problematic. On Compute Canada general purpose clusters, a new job is submitted on average about every five seconds, so any projection more than five seconds into the future is clearly at risk of changing.
+For jobs which are already running, the start time reported by <tt>squeue</tt> is perfectly accurate.
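The effect of the two assumptions can be illustrated with a small toy model. This is only a sketch, not Slurm's actual algorithm: it assumes strict priority order with no backfilling, a single pool of identical nodes, and times in arbitrary units. The function <tt>projected_start</tt> and all job names and numbers are hypothetical.

```python
import heapq


def projected_start(now, total_nodes, running, pending, target):
    """Project when the pending job named `target` would start.

    running: list of (assumed_end_time, nodes) for jobs already running
    pending: list of (priority, name, nodes, time_limit)

    Toy model only (hypothetical, not Slurm's scheduler): jobs start in
    strict priority order, every job is assumed to run exactly to its
    time limit, and no new jobs are assumed to arrive.
    """
    free = total_nodes - sum(nodes for _, nodes in running)
    events = list(running)  # min-heap of (end_time, nodes_released)
    heapq.heapify(events)
    t = now
    # Highest priority first.
    for priority, name, nodes, limit in sorted(pending, key=lambda j: -j[0]):
        # Advance time until enough nodes have been released.
        while free < nodes:
            end, released = heapq.heappop(events)
            t = max(t, end)
            free += released
        if name == target:
            return t
        # This job starts at time t and is assumed to use its full limit.
        free -= nodes
        heapq.heappush(events, (t + limit, nodes))
    raise ValueError(f"job {target!r} is not in the pending list")


# A 4-node cluster, fully occupied by one job whose time limit ends at t=10.
pending = [(100, "A", 4, 5), (50, "mine", 2, 3)]

# Projection made at t=0 under both assumptions.
base = projected_start(0, 4, [(10, 4)], pending, "mine")

# Assumption 1 fails: the running job actually finishes at t=4.
early_finish = projected_start(0, 4, [(4, 4)], pending, "mine")

# Assumption 2 fails: a new job "B" arrives with priority above "mine".
new_arrival = projected_start(
    0, 4, [(10, 4)], pending + [(75, "B", 4, 6)], "mine"
)

print(base, early_finish, new_arrival)
```

In this toy setup the projected start for <tt>mine</tt> is 15, but it moves to 9 if the running job finishes early and to 21 if a single higher-priority job arrives. Each change to the inputs yields a different projection, which is why the projected START_TIME of a pending job should be read as a rough estimate at best.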
 </translate>
