The MOAB Scheduler

While Torque manages job submission and execution, MOAB controls all aspects of job scheduling on the cluster. As described in System Architecture, all nodes are divided into four partitions: SharedMem, DistributedMem, GPU, and Fermi. Partition names are case-sensitive in the following MOAB commands. Manual pages are available for all MOAB commands by typing

man <moab-command>

Job states

Every job managed by MOAB exists in one of the five states listed below. To determine the state of a job, run

checkjob <jobid>

State      Description
Running    Job is currently executing on one or more compute nodes.
Deferred   Job has been held by MOAB because it cannot be scheduled under current conditions. A deferred job is held for 1 hour and then returned to the idle queue. This process is repeated 24 times before the job is placed in batch hold.
Hold       Job is idle and not eligible to run because of a user, system administrator, or batch system hold.
Idle       Job is queued and eligible to run but is not executing.
Blocked    Job is queued but ineligible to run because of throttling restrictions, such as the maximum number of cores already being in use, or other policy violations. In some cases, a job is automatically canceled when a policy violation is detected.
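
For scripts that need only the state rather than the full checkjob report, the state can be filtered out of the output. The following is a minimal sketch, not a site-provided tool: the function name job_state is hypothetical, and it assumes that checkjob prints the state on a line of the form "State: <value>", which should be verified against the actual output on this cluster.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the "State:" line from checkjob
# output read on standard input.
# Assumption: checkjob reports the state as "State: <value>" on its own line.
job_state() {
    awk '$1 == "State:" { print $2; exit }'
}

# Intended use on the cluster (not executed here):
#   checkjob <jobid> | job_state
```

Reading from standard input keeps the helper independent of how checkjob is invoked, so it can also be applied to saved output.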

checkjob documentation

Show start time

The best way to estimate a job's start time is to submit it and run

showstart <jobid>

The showstart command estimates when a particular job will start and complete, for example:

09:07:31 # showstart 2165961
job 2165961 requires 8 procs for 1:12:00:00
Estimated Rsv based start in                 3:31:22 on Fri Aug 15 12:40:10
Estimated Rsv based completion in         1:15:31:22 on Sun Aug 17 00:40:10
Best Partition: SharedMem
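
When only the estimated start date is needed, for example in a wrapper script, it can be extracted from the showstart output. The sketch below assumes the output has the same layout as the sample above; the function name start_estimate is hypothetical and not part of MOAB.

```shell
#!/bin/sh
# Hypothetical helper: read showstart output on standard input and print the
# estimated start date, i.e. the text after " on " in the
# "Estimated Rsv based start in ..." line.
# Assumption: the output layout matches the sample shown in this section.
start_estimate() {
    sed -n 's/^Estimated Rsv based start in .* on //p'
}

# Intended use on the cluster (not executed here):
#   showstart <jobid> | start_estimate
```

Applied to the sample output above, the helper prints "Fri Aug 15 12:40:10". If no reservation-based estimate line is present, it prints nothing.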

showstart documentation