Moab sometimes does not resume a suspended job.


Issue: Moab sometimes does not resume a suspended job.


Symptom: The moab.log contains messages showing a circular priority dependency between two suspended jobs on the same node:

 

job '830' blocks job '760' on node 'node08' (759421 > 7080)
job '760' blocks job '830' on node 'node08' (759417 > 7079)

Note that job 830 is blocking job 760 and job 760 is blocking job 830, so neither job is resumed.
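To check a log for this pattern, a small script can look for pairs of jobs that block each other on the same node. The following is a minimal sketch in Python, not a Moab tool: the function name `find_mutual_blocks` is hypothetical, and the regular expression is modeled only on the two log lines shown above, so it may need adjusting if your moab.log lines carry different prefixes or wording.

```python
import re

# Pattern modeled on the two log lines quoted above (an assumption:
# the exact moab.log wording may differ between Moab versions).
BLOCK_RE = re.compile(r"job '([^']+)' blocks job '([^']+)' on node '([^']+)'")


def find_mutual_blocks(lines):
    """Return sorted (node, job_a, job_b) tuples where each job blocks the other."""
    blocks = set()
    for line in lines:
        match = BLOCK_RE.search(line)
        if match:
            blocker, blocked, node = match.groups()
            blocks.add((node, blocker, blocked))

    mutual = set()
    for node, blocker, blocked in blocks:
        # A circular block exists when the reverse relation is also logged.
        if (node, blocked, blocker) in blocks:
            job_a, job_b = sorted((blocker, blocked))
            mutual.add((node, job_a, job_b))
    return sorted(mutual)


sample = [
    "job '830' blocks job '760' on node 'node08' (759421 > 7080)",
    "job '760' blocks job '830' on node 'node08' (759417 > 7079)",
]
print(find_mutual_blocks(sample))  # [('node08', '760', '830')]
```

To scan a real log, pass an open file object instead of the sample list, for example `find_mutual_blocks(open(path))`, where `path` is the location of your moab.log.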

 

Solution:

In your moab.cfg, add the following parameter:

CHECKSUSPENDEDJOBPRIORITY FALSE

When enabled, CHECKSUSPENDEDJOBPRIORITY makes Moab look at the priority of the suspended jobs on each node and prevents it from starting a job on any node containing a suspended job of higher priority. Setting it to FALSE disables that check. This is a corner case and is not seen often; however, in rare cases two jobs that were started on the same node will have the same or nearly the same priority, and each ends up blocking the other.
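To confirm that the setting is actually present in a configuration file, the file can be scanned for the parameter. This is a minimal Python sketch under stated assumptions: `get_param` is a hypothetical helper, and it assumes that each setting is a `NAME VALUE` line, that `#` starts a comment, that names are matched case-insensitively, and that the last occurrence wins. None of these are confirmed Moab parsing rules.

```python
def get_param(cfg_text, name):
    """Return the last value set for `name` in moab.cfg-style text, or None.

    Assumptions (not confirmed Moab behavior): '#' starts a comment,
    names are case-insensitive, and a later line overrides an earlier one.
    """
    value = None
    for raw in cfg_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        parts = line.split(None, 1)  # split the name from the rest of the line
        if parts[0].upper() == name.upper():
            value = parts[1].strip() if len(parts) > 1 else ""
    return value


sample_cfg = """
# scheduler settings
CHECKSUSPENDEDJOBPRIORITY FALSE
"""
print(get_param(sample_cfg, "CHECKSUSPENDEDJOBPRIORITY"))  # FALSE
```

A result of `None` means the parameter is not set in the text that was scanned.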

 

Tags: priority, resume, suspend, suspended
Last update: 2015-10-07 16:46
Author: Jason Booth
Revision: 1.0