Issue: Jobs are being deferred because a connection cannot be made to the MAM database.
Affected Versions: 8.0.x
Symptom: When Moab tries to start a job, it cannot, because MAM cannot connect to the database even though the database is up and MAM has access.
Message[0] Failure registering job Start (12345) with accounting manager -- server rejected request with status code 730 - Failed obtaining database connection: DBI connect('dbname=gold;host=localhost','gold',...) failed: FATAL: remaining connection slots are reserved for non-replication superuser connections at /opt/mam/lib/Gold/Database.pm line 185.
Message[1] cannot modify job - cannot set job '12345.pbserver.local' attr 'comment:' to 'Failure registering job Start (12345) with accounting manager' (rc: 15018 'Request invalid for state of job MSG=invalid state for job - COMPLETE')
Solution: In this case MAM was opening a large number of connections to the database, which used up the connection slots available to non-superuser roles. This is an issue with the thread/connection count for MAM.
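To confirm that the connection slots really are exhausted, it can help to compare the number of open connections with the PostgreSQL limits. The FATAL message above is what PostgreSQL reports when ordinary roles have used up max_connections minus superuser_reserved_connections. The following is a minimal sketch of that arithmetic, not part of MAM or Moab; the function names and the QUERIES dictionary are hypothetical, and the count from pg_stat_activity should be treated as approximate.

```python
# Hypothetical helper for illustration only; not shipped with MAM or Moab.

# Statements that can be run in psql (as a superuser) to collect the inputs.
QUERIES = {
    "max_connections": "SHOW max_connections;",
    "superuser_reserved": "SHOW superuser_reserved_connections;",
    "in_use": "SELECT count(*) FROM pg_stat_activity;",
}


def free_slots(max_connections: int, superuser_reserved: int, in_use: int) -> int:
    """Return how many slots are still open to non-superuser roles.

    PostgreSQL holds back `superuser_reserved` slots, so ordinary roles
    (such as the 'gold' user in the error above) can only use the rest.
    """
    return max(0, max_connections - superuser_reserved - in_use)


def is_exhausted(max_connections: int, superuser_reserved: int, in_use: int) -> bool:
    """True when a new non-superuser connection would be refused."""
    return free_slots(max_connections, superuser_reserved, in_use) == 0


if __name__ == "__main__":
    # Example values only; substitute the results of the QUERIES above.
    print(free_slots(100, 3, 40))
    print(is_exhausted(100, 3, 97))
```

If is_exhausted returns True while MAM holds most of the open connections, that is consistent with the thread/connection count issue described here.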
https://wiki.postgresql.org/wiki/Introduction_to_VACUUM,_ANALYZE,_EXPLAIN,_and_COUNT#Using_ANALYZE_to_optimize_PostgreSQL_queries
http://www.postgresql.org/docs/9.1/static/routine-vacuuming.html
You can also run ANALYZE on the database, which should speed up queries.
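One way to run ANALYZE is through the psql client. The sketch below only builds and runs that command; it assumes psql is on the PATH and takes the database name, user, and host (gold, gold, localhost) from the connection string in the error message above. The function names are hypothetical, and authentication is left to the local PostgreSQL setup.

```python
import subprocess


def analyze_command(dbname: str = "gold", user: str = "gold",
                    host: str = "localhost") -> list:
    """Build the psql invocation that runs ANALYZE on the whole database.

    The defaults mirror the DBI connect string in the error message and
    are assumptions; adjust them for the actual site.
    """
    return ["psql", "-h", host, "-U", user, "-d", dbname, "-c", "ANALYZE;"]


def run_analyze(dbname: str = "gold", user: str = "gold",
                host: str = "localhost") -> int:
    """Run ANALYZE via psql and return the psql exit code."""
    return subprocess.run(analyze_command(dbname, user, host)).returncode


if __name__ == "__main__":
    # Show the command rather than running it, so it can be reviewed first.
    print(" ".join(analyze_command()))
```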
You can also set DELETEISBLOCKING to improve responsiveness while deleting jobs.
Tags: database, db, refused