Issue: Missing checkpoint (.cp) file for job submitted from a submit host.
Affected Version: All versions of Moab with Torque
Symptom: A job is submitted from a submit host, but no checkpoint file for it appears in the Moab spool directory. That file often contains data that scripts need.
Solution: Moab provides a resource manager flag, FULLCP, that causes all of the job information to be copied before the job runs. Set it in the resource manager configuration like this:
RMCFG[pbs] TYPE=PBS SUBMITCMD=/usr/local/bin/qsub FLAGS=FULLCP
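To confirm that the flag is present in a configuration file, a small check such as the following may help. This is a minimal sketch, not a Moab tool: the function name rm_has_flag is hypothetical, and it assumes that parameters on an RMCFG line are whitespace-separated, that multiple FLAGS values are comma-separated, and that "#" starts a comment.

```python
import re


def rm_has_flag(cfg_text, rm_name, flag):
    """Return True if an RMCFG[rm_name] line in cfg_text lists flag in FLAGS=.

    Hypothetical helper. Assumes whitespace-separated KEY=VALUE parameters
    and comma-separated flag values.
    """
    pattern = re.compile(
        r"^\s*RMCFG\[" + re.escape(rm_name) + r"\]\s+(.*)$", re.IGNORECASE
    )
    for line in cfg_text.splitlines():
        line = line.split("#", 1)[0]  # assumption: '#' begins a comment
        match = pattern.match(line)
        if not match:
            continue
        for token in match.group(1).split():
            key, sep, value = token.partition("=")
            if sep and key.upper() == "FLAGS":
                flags = [f.strip().upper() for f in value.split(",")]
                if flag.upper() in flags:
                    return True
    return False


# The configuration line from this article.
sample = "RMCFG[pbs] TYPE=PBS SUBMITCMD=/usr/local/bin/qsub FLAGS=FULLCP\n"
print(rm_has_flag(sample, "pbs", "FULLCP"))
```

In practice the text would be read from the site's Moab configuration file rather than from a literal string.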
This issue surfaced during the DataWarp project, which gathers information from the checkpoint files. The DataWarp submit filter creates additional storage jobs; their checkpoint files appeared on the Moab server, but the checkpoint file for the original job did not.