Why doesn't my job start? (overlapping a reservation)


Issue: A user's job may overlap a reservation without the user realizing it

 

Affected Versions: All versions

 

Symptom: A system has a maintenance reservation scheduled, and a user submits a job with a walltime that overlaps that reservation. The user then wonders why their job won’t start.

 

Solution: Ultimately this is a usage issue, and users should be made aware of maintenance reservations. Users do not always remember them, however. One way to address this at job submission time is a submit filter. This will not help if the reservation is created after the user’s job has been submitted, but it may still be better than nothing.

 

Whether the submit filter is a Moab or Torque filter, the general strategy is the same. The script would need to gather reservation information, find the walltime from the job submission, and warn the user if there is an overlap (current time + walltime > next reservation start). This will at least remind users of this possibility.
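The overlap test itself can be sketched in a few lines of Python. This is only an illustration of the check described above, not a complete Moab or Torque submit filter. The function names are hypothetical, the walltime is assumed to be either plain seconds or a colon-separated [[DD:]HH:]MM:SS string, and the reservation start times are assumed to have been gathered already as epoch seconds.

```python
import time


def parse_walltime(text):
    """Convert a walltime such as '3600' or '[[DD:]HH:]MM:SS' to seconds.

    Assumption: the walltime uses one of these two forms.
    """
    fields = [int(field) for field in text.strip().split(":")]
    if not 1 <= len(fields) <= 4:
        raise ValueError("unrecognized walltime: %r" % text)
    # Rightmost field is seconds, then minutes, hours, days.
    multipliers = [1, 60, 3600, 86400]
    return sum(value * mult for value, mult in zip(reversed(fields), multipliers))


def first_overlap(walltime_seconds, reservation_starts, now=None):
    """Return the start of the next reservation the job would run into.

    Returns None when current time + walltime ends before the next
    reservation starts, or when no future reservation is known.
    reservation_starts is a list of epoch seconds (assumed input).
    """
    now = time.time() if now is None else now
    upcoming = [start for start in reservation_starts if start > now]
    if not upcoming:
        return None
    next_start = min(upcoming)
    job_end = now + walltime_seconds
    return next_start if job_end > next_start else None
```

A filter built around this would call parse_walltime on the requested walltime, pass the result to first_overlap, and print a warning to the user when the return value is not None.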

 

Note that the filter may also try running “showstart <procs>@<walltime>”, but that still only provides a prediction. Also, a Moab enhancement ticket was opened to hopefully find a better way to accomplish this.

Last update:
2017-09-29 20:52
Author:
Rob Greenbank
Revision:
1.0