Question

Greetings,

I'm evaluating some components for a multi-data center distributed system. We're going to be using message queues (via either RabbitMQ or Qpid) so agents can make asynchronous requests to other agents without worrying about addressing, routing, load balancing or retransmission.

In many cases, the agents will be interacting with components that were not designed for highly concurrent access, so locking and cross-agent coordination will be needed to avoid race conditions. Also, we'd like the system to automatically respond to agent or data center failures.

With the above use cases in mind, ZooKeeper seemed like it might be a good fit. But I'm wondering if trying to use both ZK and message queuing is overkill. It seems like what Zookeeper does could be accomplished by my own cluster manager using AMQP messaging, but that would be hard to get really right. On the other hand, I've seen some examples where ZooKeeper was used to implement message queuing, but I think RabbitMQ/Qpid are a more natural fit for that.

Has anyone out there used a combination like this?

Thanks in advance,

-Chris

Was it helpful?

Solution

Coming into this late, but maybe it will be of some use. The primary consideration should be the performance characteristics of your system. ZooKeeper, like you said, is more than capable of implementing a task distribution system using a distributed queue, but zk currently, is more optimized for reads than it is for writes (this only comes into play in the 1000's of ops per second range). If your throughput needs are less than this, then using just zk to implement your system would reduce number of runtime components and make it simpler. Of course, you should always run your performance tests before deciding.

Distributed coordination is really hard to get right, so I would definitely recommend using zookeeper for that and not rolling your own.

OTHER TIPS

Not quite sure what ZooKeeper exactly is, but I guess that using a component from Apache (if it does fit your needs well) is preferred before managing such things as distributed synchronization and group services at your own. You could of course hire a team of developers especially for that purpose, but that doesn't guarantee you a better implementation.

I guess, that it would be anyways implemented as a separate component, cuz other way could bring much complexity and decelerate the workflow; so the preference of ZooKeeper or anything similar is kind of obvious (to me).

And surely, unless you're in the global optimization phase of your project workflow, I guess it would be better to use RabbitMQ or such (I would even stress that, cuz implementations (especially commercial) of the AMQP would be more reliable than everything that you'd come up with).

So I would go for both, carefully chosing the appropriate thirdparty products, but using as much of them as it is needed. And that's just my opinion; thanks for reading :)

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top