I'm running jobs on a Torque server that I don't have admin access to. Often a slot reports that it is free, but when the scheduler assigns a job to it, the job stays queued. As a result, many of my jobs get assigned to that machine and stay queued indefinitely.

Is there any way to start these jobs or move them to a different machine? Or can I avoid that machine when submitting jobs? Thanks!


Solution

If you don't have root permissions, you may need your system administrator to unblock the job for you.

  1. Check the qstat -f output to see why the job is blocked. You may have requested more resources than are available.

  2. In the future, specify exactly which node you want (assuming your admin allows that): qsub somejob.sh -l

  3. If you use a scheduler such as Moab, try the following (again, assuming you have the rights): mjobctl -u mjobctl -l
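
Steps 1 and 2 can be sketched as two small shell helpers. This is only an illustration under some assumptions: the field names (job_state, queue, exec_host, comment, Resource_List.*) are the ones qstat -f typically prints on Torque, the function names job_summary and node_request are made up for this example, and node01 and 12345 are hypothetical placeholders. Whether a request for a specific node is honored depends on how your admin has configured the cluster.

```shell
#!/bin/sh
# Hypothetical helpers; names and example values are illustrative only.

# Read `qstat -f` output on stdin and print only the fields that usually
# explain why a job is still queued. Continuation lines of wrapped values
# are ignored, so long comments may appear truncated.
job_summary() {
    awk '
    {
        i = index($0, " = ")          # position of the first "key = value" separator
        if (i == 0) next              # skip lines that are not key/value pairs
        key = substr($0, 1, i - 1)
        sub(/^[ \t]+/, "", key)       # drop the leading indentation
        if (key == "job_state" || key == "queue" || key == "exec_host" ||
            key == "comment" || key ~ /^Resource_List\./)
            print key "=" substr($0, i + 3)
    }'
}

# Build a resource request that pins a job to one named node.
# $1 = node name, $2 = processors per node (defaults to 1).
node_request() {
    printf 'nodes=%s:ppn=%s\n' "$1" "${2:-1}"
}

# Example usage (not run here, since it needs a Torque installation):
#   qstat -f 12345 | job_summary
#   qsub -l "$(node_request node01 4)" somejob.sh
```

If the summary shows a Resource_List entry larger than any node can provide, the job will never start on its own and is easier to resubmit with a smaller request than to unblock.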
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow