Question

I'm running jobs on a Torque server that I don't have admin access to. Often a slot reports that it is free, but when the scheduler assigns a job to it, the job stays queued. As a result, many of my jobs get assigned to that machine and stay queued indefinitely.

Is there any way to start these jobs/move them to a different machine? Or maybe avoid the machine when submitting jobs? Thanks!


Solution

If you don't have root permissions, you may need your system administrator to unblock the job for you.

  1. Check the qstat -f output to see why the job was blocked. It may be that you requested more resources than are available.

  2. In the future, specify exactly which node you want (assuming your admin allows that): qsub somejob.sh -l

  3. If you use a scheduler such as Moab, try the following (again, assuming you have the rights): mjobctl -u mjobctl -l
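
For step 1, the full qstat -f output is long, so it can help to pull out only the lines that explain why a job is waiting. The following is a minimal sketch, not an official Torque tool: the function name why_queued is hypothetical, and it assumes the usual "name = value" layout of qstat -f output with the job_state and comment attributes present. Long values that qstat wraps onto continuation lines are only shown up to the first line.

```shell
# why_queued: read `qstat -f` output on stdin and print only the job state
# and the scheduler comment, which usually says why a job is not running.
# (Hypothetical helper name; assumes the "    name = value" layout.)
why_queued() {
    awk '
        # Strip the attribute name and keep the value as-is.
        /^[ \t]*job_state = / { sub(/^[ \t]*job_state = /, ""); print "state: " $0 }
        /^[ \t]*comment = /   { sub(/^[ \t]*comment = /, "");   print "comment: " $0 }
    '
}

# Example use on a cluster (12345 is a placeholder job ID):
#   qstat -f 12345 | why_queued
```

If the comment points at a particular node, pbsnodes -a shows the state that Torque has recorded for each node, which may help when reporting the problem to your administrator.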
Licensed under: CC-BY-SA with attribution