Issue: Why does MAM have double entries for a job charge?
Affected Versions: All versions before 8.1.0
Symptom:
Id Type Instance Charge Stage User Group Account Organization Class QualityOfService Machine Nodes Processors CPUTime Memory Disk Duration StartTime EndTime Description
------ ---- ------------- ------ ------ -------- -------- -------- ------------ ----- ---------------- ------- ----- ---------- ------- ------ ---- -------- ------------------- ------------------- -----------
957169 Job site1.430686 0.000 Charge science2 science2 scienc020 site1 someq site1 6 144 2 44 2015-06-18 08:45:08 2015-06-18 08:45:52
957480 Job site1.430686 0.000 Charge science2 science2 scienc020 site1 someq site1 6 144 2 46 2015-06-18 13:52:31 2015-06-18 13:53:17
958304 Job site1.430686 0.000 Charge science2 science2 scienc020 site1 someq site1 6 144 63 2015-06-18 17:43:26 2015-06-18 17:44:29
959345 Job site1.430686 0.000 Charge science2 science2 scienc020 site1 someq site1 6 144 61 2015-06-19 10:24:31 2015-06-19 10:25:32
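A listing like the one above can be checked for repeated charges with a short script. The following is only a sketch: it assumes the listing has been saved as text with one transaction per line, that fields are separated by whitespace, and that the first five columns (Id, Type, Instance, Charge, Stage) are always populated. The function name `find_repeated_charges` is hypothetical and not part of MAM.

```python
from collections import defaultdict


def find_repeated_charges(listing: str) -> dict:
    """Map each job instance to its Charge transaction ids,
    keeping only instances that were charged more than once."""
    charges = defaultdict(list)
    for line in listing.splitlines():
        fields = line.split()
        # Data rows start with a numeric transaction id; this skips
        # the header line and the row of dashes (assumption).
        if len(fields) < 5 or not fields[0].isdigit():
            continue
        txn_id, _type, instance, _amount, stage = fields[:5]
        if stage == "Charge":
            charges[instance].append(txn_id)
    return {job: ids for job, ids in charges.items() if len(ids) > 1}
```

Any instance returned by this function has more than one Charge transaction recorded against it, which is the symptom described here.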
Solution:
For now the best solution is to update to the latest version of 8.1.
We have seen a problem just like this where Torque reports the job state to Moab as Complete, later reverts the reported state to Exiting, and then reports Complete again. Each repeated Complete report appears to trigger another final charge, which would produce the duplicate entries. To confirm this, check the Torque pbs_server logs for the job state switching from COMPLETE-COMPLETE back to EXITING-EXITED. This is a Torque problem that, to my knowledge, has not been fixed (TRQ-2870).
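The log check can be automated roughly as follows. This is a sketch under stated assumptions: it does not rely on the exact pbs_server log layout, only on the assumption that each relevant log line contains both the job id and the state name (COMPLETE or EXITING). The function names are hypothetical, and the job id is matched as a plain substring, so pass an id specific enough to avoid accidental matches.

```python
import re

# Only the two states involved in the flip-flop are tracked.
STATE_RE = re.compile(r"\b(COMPLETE|EXITING)\b")


def state_sequence(log_lines, job_id):
    """Return the ordered COMPLETE/EXITING states seen for one job,
    with consecutive repeats collapsed."""
    seq = []
    for line in log_lines:
        if job_id not in line:
            continue
        for match in STATE_RE.finditer(line):
            state = match.group(1)
            if not seq or seq[-1] != state:
                seq.append(state)
    return seq


def reverted_after_complete(log_lines, job_id):
    """True if the job is seen as EXITING again after reaching COMPLETE."""
    seq = state_sequence(log_lines, job_id)
    if "COMPLETE" not in seq:
        return False
    return "EXITING" in seq[seq.index("COMPLETE"):]
```

A job that behaves normally should show EXITING followed by COMPLETE and nothing further, so `reverted_after_complete` returns `False` for it.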
Although the problem still exists in Torque, our MAM developer has implemented a check in Moab that prevents the final charge from being issued more than once (MOAB-7565). This fix is included in 8.1, so upgrading to 8.1 should resolve the issue.
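The general idea behind a "charge only once" check can be illustrated as below. This is not Moab's actual implementation, which is not shown in this article. It is a minimal sketch of an idempotency guard, and the class name `FinalChargeGuard` and the `issue_charge` callback are hypothetical.

```python
class FinalChargeGuard:
    """Remembers which job instances have already had a final charge issued."""

    def __init__(self):
        self._charged = set()

    def charge_once(self, instance, issue_charge):
        """Call issue_charge(instance) only the first time it is requested.

        Returns True if the charge was issued, False if it was skipped.
        """
        if instance in self._charged:
            # A repeated completion report for the same job: skip the charge.
            return False
        issue_charge(instance)
        self._charged.add(instance)
        return True
```

With a guard like this, a job that is reported as Complete several times still produces a single final charge, because every report after the first is skipped.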
Jira: MOAB-7565
Tags: charge, double, double charge, mam