38,760
edits
(Importing a new version from external source) |
(Importing a new version from external source) |
||
Line 2: | Line 2: | ||
You can see why your jobs are in the <tt>PD</tt> (pending) state by running the <tt>squeue -u <username></tt> command on the cluster.<br><br> | You can see why your jobs are in the <tt>PD</tt> (pending) state by running the <tt>squeue -u <username></tt> command on the cluster.<br><br> | ||
The <tt>(REASON)</tt> column typically has the values <tt>Resources</tt> or <tt>Priority</tt>. | The <tt>(REASON)</tt> column typically has the values <tt>Resources</tt> or <tt>Priority</tt>. | ||
* <tt>Resources</tt>ː The cluster is simply very busy and you will have to be patient or perhaps consider if you can submit a job that asks for fewer resources (e.g. nodes, memory, time). | * <tt>Resources</tt>ː The cluster is simply very busy and you will have to be patient or perhaps consider if you can submit a job that asks for fewer resources (e.g. CPUs/nodes, GPUs, memory, time). | ||
* <tt>Priority</tt>ː Your job is waiting to start due to its lower priority. This is because you and other members of your research group have been over-consuming your fair share of the cluster resources in the recent past, something you can track using the command <tt>sshare</tt> as explained in [[Job scheduling policies]]. | * <tt>Priority</tt>ː Your job is waiting to start due to its lower priority. This is because you and other members of your research group have been over-consuming your fair share of the cluster resources in the recent past, something you can track using the command <tt>sshare</tt> as explained in [[Job scheduling policies]]. |