Issue: How can I sync my Moab and TORQUE batch job IDs?
Affected Versions: All
Symptom: When submitting jobs, you see different job IDs for Moab and for TORQUE. For example: showq -v displays 2965/38465.localserver.pbs
Solution: Set the following parameters in Moab. This will allow Moab to tell the resource manager that it should honor the job ID Moab sends it.
RMCFG[torque] TYPE=PBS
RMCFG[torque] SYNCJOBID=TRUE
RMCFG[torque] FLAGS=PROXYJOBSUBMISSION
RMCFG[internal] JOBIDFORMAT=INTEGER
Refer to the Moab Workload Manager Administrator Guide for your version and operating system, under this section: "Resource Managers and Interfaces > Resource Manager Configuration > Synchronizing Job IDs in Torque and Moab". [Documentation home]
Additionally, Moab must have manager and operator permissions in qmgr (if Torque and Moab both run as root on the same server, these permissions are already in place). Example:
# qmgr -c "set server managers+=root@moabserver"
# qmgr -c "set server operators+=root@moabserver"
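To confirm that the permissions took effect, you can inspect the server configuration. The sketch below is illustrative only: has_acl_entry is a hypothetical helper (not part of TORQUE), and it assumes that qmgr -c "print server" lists ACL entries as lines of the form "set server managers = user@host" or "set server managers += user@host". Adjust the host name to your site.

```shell
# has_acl_entry ATTRIBUTE USER@HOST
# Hypothetical helper: reads "print server" style output on stdin and
# succeeds if ATTRIBUTE (managers or operators) contains USER@HOST.
# Assumes entries look like: set server managers = root@moabserver
has_acl_entry() {
    awk -v attr="$1" -v who="$2" '
        $1 == "set" && $2 == "server" && $3 == attr && $NF == who { found = 1 }
        END { exit !found }
    '
}

# Only query the live server if qmgr is actually installed on this host.
if command -v qmgr >/dev/null 2>&1; then
    for attr in managers operators; do
        if qmgr -c "print server" | has_acl_entry "$attr" "root@moabserver"; then
            echo "$attr: root@moabserver is present"
        else
            echo "$attr: root@moabserver is MISSING"
        fi
    done
fi
```

If either attribute is reported as missing, rerun the corresponding qmgr "set server" command shown above.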
Additional Considerations: Currently it is not possible to sync the job ID of every job. For example, if an interactive job is submitted through qsub, TORQUE gives that job the next job ID recorded in qmgr (displayed as "set server next_job_number = 83397"), regardless of what Moab thinks the next job ID should be. The same applies when using SLURM. We suggest using a separate job ID range for interactive jobs and other jobs submitted directly to the resource manager. For example, you can reserve job IDs 1 through 800,000 for non-interactive jobs submitted through Moab, and then configure the resource manager to assign job IDs from 900,000 to 1,000,000. The following example shows how to configure different job ID ranges for the scheduler and the resource manager.
moab.cfg
SCHEDCFG[Moab] MAXJOBID=499999 MINJOBID=1000
RMCFG[slurm] SYNCJOBID=TRUE EPORT=10777
RMCFG[internal] JOBIDFORMAT=integer
slurm.conf
FirstJobId=500000
MaxJobId=1000000
Tags: Sync, sync jobid
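As a quick sanity check on the moab.cfg and slurm.conf ranges shown above, the following sketch tests whether two job ID ranges are disjoint. The ranges_disjoint function is a hypothetical helper written for illustration, and the numbers are simply the example values from this article; substitute your own.

```shell
# ranges_disjoint MIN_A MAX_A MIN_B MAX_B
# Hypothetical helper: succeeds when the two inclusive ranges share no ID.
ranges_disjoint() {
    [ "$2" -lt "$3" ] || [ "$4" -lt "$1" ]
}

# Example values from this article:
#   Moab  (moab.cfg):   MINJOBID=1000     MAXJOBID=499999
#   SLURM (slurm.conf): FirstJobId=500000 MaxJobId=1000000
if ranges_disjoint 1000 499999 500000 1000000; then
    echo "job ID ranges do not overlap"
else
    echo "job ID ranges overlap"
fi
```

With the example values, the Moab range ends at 499999 and the SLURM range starts at 500000, so the check reports that the ranges do not overlap.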