Question

I am using Celery and ZooKeeper (a kazoo lock) to lock my workers. My problem is that when I kill (-9) one of the workers before it releases the lock, that lock stays locked forever.

So my question is: does killing a process release the locks it holds, or is this a bug in ZooKeeper?

Solution 2

Killing a process with a kill signal will do nothing to clear "software locks" such as ZooKeeper locks.

The only locks that a KILL signal releases are OS-level locks: all of the process's file descriptors are closed, so locks tied to those descriptors are released as well. ZooKeeper locks are not OS-level locks, if only because the ZooKeeper process, even on the same machine, is not the same process as your Python process.

It is therefore not a bug in ZooKeeper, but the expected behavior of your kill -9.

OTHER TIPS

ZooKeeper locks use ephemeral nodes. An ephemeral node lives only as long as the session that created it. A session is kept alive by the process that created it periodically sending heartbeat messages to ZooKeeper.

So if you kill the process that created the lock, the lock will eventually be released: the session dies once ZooKeeper stops receiving heartbeats.

So killing a worker before the lock is released should eventually release the lock.
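
To make the timing concrete, here is a toy, in-memory model of the mechanism described above. It is not ZooKeeper or kazoo code; the class names, the fake clock, and the 10-second timeout are all made up for illustration. It only shows why a lock held by a killed worker stays taken until the session timeout has passed.

```python
class FakeClock:
    """Hypothetical manual clock so the example runs instantly and deterministically."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


class ToySessionStore:
    """Toy model of sessions and ephemeral nodes (an illustration, not ZooKeeper)."""

    def __init__(self, session_timeout, clock):
        self.session_timeout = session_timeout
        self.clock = clock
        self.last_heartbeat = {}   # session id -> time of last heartbeat
        self.ephemeral_nodes = {}  # node path -> owning session id

    def heartbeat(self, session_id):
        self.last_heartbeat[session_id] = self.clock()

    def expire_dead_sessions(self):
        # A session is dead once no heartbeat arrived within the timeout;
        # every ephemeral node it owned disappears with it.
        now = self.clock()
        dead = {s for s, t in self.last_heartbeat.items()
                if now - t > self.session_timeout}
        for session_id in dead:
            del self.last_heartbeat[session_id]
        self.ephemeral_nodes = {p: s for p, s in self.ephemeral_nodes.items()
                                if s not in dead}

    def create_ephemeral(self, path, session_id):
        """Try to take the lock node; return True on success."""
        self.expire_dead_sessions()
        if path in self.ephemeral_nodes:
            return False  # another live session still owns the node
        self.heartbeat(session_id)
        self.ephemeral_nodes[path] = session_id
        return True


def demo():
    clock = FakeClock()
    store = ToySessionStore(session_timeout=10.0, clock=clock)

    a_got_lock = store.create_ephemeral("/lock", "worker-a")
    clock.advance(5)
    store.heartbeat("worker-a")   # last heartbeat before the simulated kill -9

    clock.advance(8)              # 8 s of silence: session not yet expired
    b_early = store.create_ephemeral("/lock", "worker-b")

    clock.advance(5)              # 13 s of silence: session has expired
    b_late = store.create_ephemeral("/lock", "worker-b")
    return a_got_lock, b_early, b_late


if __name__ == "__main__":
    print(demo())
```

In this model the second worker fails to get the lock shortly after the kill and succeeds only after the timeout has elapsed, which is the "eventually" in the explanation above.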

If the lock never appears to be released, a few things could be happening:

  1. Someone else noticed the lock was released and obtained it. Presumably you are locking because there is contention, so some other process will try to acquire the lock as soon as it is released.
  2. You aren't waiting long enough. When you connect to ZooKeeper you set a session timeout parameter, which is how long the server keeps the session alive without hearing any heartbeats. You have to wait at least this long to see the lock released.
  3. There is a bug in kazoo. This is possible, but it looks like the kazoo lock recipe uses ephemeral nodes, and the use case you describe is a very basic one.

It is very unlikely that this is a ZooKeeper bug.

How do you know the lock is not being released?
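
One way to check is to list the nodes under the lock path and see which session owns each of them. The helper below is a sketch: `list_lock_nodes` is a name I made up, and it only assumes a client object with `get_children` and `get` methods shaped like kazoo's (with `get` returning the node data and a stat that has an `ephemeralOwner` field). The host, lock path, and timeout in the commented usage are placeholders, not values from the question.

```python
def list_lock_nodes(client, lock_path):
    """Return (node_name, data, ephemeral_owner) for each child of lock_path.

    `client` is expected to behave like a started kazoo client. A node can
    vanish between the two calls; this sketch does not handle that case.
    """
    rows = []
    for name in sorted(client.get_children(lock_path)):
        data, stat = client.get(f"{lock_path}/{name}")
        # With the kazoo lock recipe the node data is the lock identifier,
        # if one was passed when the lock was created.
        rows.append((name, data.decode("utf-8", "replace"), stat.ephemeralOwner))
    return rows


# Possible usage against a real server (requires kazoo and a running ZooKeeper):
#
#   from kazoo.client import KazooClient
#   zk = KazooClient(hosts="127.0.0.1:2181", timeout=10.0)  # placeholder values
#   zk.start()
#   print(list_lock_nodes(zk, "/locks/my-task"))             # placeholder path
#   zk.stop()
```

If the killed worker's node is still listed after the session timeout has passed, that points at the session still being alive somewhere. If the list shows a different node, another worker has taken the lock in the meantime.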

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow