Question
I lost my connection to a cluster, and when I logged back in, I noticed that my calculations were still running on the node I had been working on. How can I log back into that specific node? I tried:
$ qlogin -l h=node27
I get the following:
Your job 33551 ("QLOGIN") has been submitted
waiting for interactive job to be scheduled ...timeout (5 s) expired while waiting on socket fd 4
Your "qlogin" request could not be scheduled, try again later.
What can I do?
Solution
I figured it out. This worked for me:
$ ssh node27
Other tips
This is likely happening because the node you've requested is in use, or the scheduler does not think you are eligible to run jobs on it.
While you may be able to ssh to the node, this is not the same as requesting resources with qlogin, and it will circumvent the job scheduler, potentially overcommitting the node.
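Before going around the scheduler, it can help to see how busy the node is. Here is a minimal sketch, assuming a Grid Engine installation with qhost and qstat on your PATH. The name check_node is a hypothetical helper for illustration, not part of Grid Engine, and what the commands report depends on how your site is configured.

```shell
#!/bin/sh
# Sketch only: inspect a node before requesting it from the scheduler.
# Assumes Grid Engine's qhost and qstat are installed and on PATH.
# check_node is a hypothetical helper name, not a Grid Engine command.
check_node() {
    node="$1"
    if [ -z "$node" ]; then
        echo "usage: check_node <hostname>" >&2
        return 2
    fi
    # Show load and memory for the host, plus the jobs running on it.
    qhost -h "$node" -j || return 1
    # Show the queue instances on that host, including slot usage.
    qstat -f -q "*@$node"
}
```

Calling check_node node27 runs both commands for that host. If every slot on the node is taken, that would be consistent with qlogin timing out.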
If you have confirmed with the cluster admin that you should be able to run jobs on this node with qlogin, you can wait for sufficient resources to become available on that node with:
$ qlogin -l h=node27 -now n
The -now n option tells qlogin not to give up if the resources you've requested aren't immediately available.