Question

I am working on a brand new SolrCloud - ZooKeeper infrastructure.

Some background information:

  • all other services (mostly web site infrastructure) are distributed across two data centers, with active-active configurations.
  • at the network level, the servers are set up on extended LANs, with dark fibre between the data centers, so latency is minimal.
  • the SolrCloud - ZooKeeper infrastructure will be used by most of these applications.

I have a SolrCloud cluster and a ZooKeeper ensemble running. Implementation at this level is fine.

But I wonder how to distribute my ZooKeeper servers. I must have an odd number of servers, but I only have 2 data centers. If one fails, I have a 50-50 chance that I will lose the majority.

Any ideas? So far I have thought of:

  • requesting a third data center (not likely to happen, $$$!)

  • host 2 per data center and 1 on an external cloud provider (Amazon or ...?). Again $$$

  • set up an odd number at data center 1 and use an observer at site 2. What happens then if site 1 fails? Can SolrCloud work with only 1 observer?

Thank you for any idea and comments.

Nic

Solution 2

This was a long time ago, but let me share what I did.

I got a third site to host a third ZooKeeper instance. This site is another office of my company, not a "full DC". So each site has 1 ZK.
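For illustration, here is a minimal sketch of what the `server.N` entries in `zoo.cfg` could look like for such a one-per-site ensemble. The hostnames are made up (not my real ones), and 2888/3888 are just the ports commonly used in the ZooKeeper documentation, so adjust to your own setup.

```python
def render_server_lines(hosts: list[str],
                        peer_port: int = 2888,
                        election_port: int = 3888) -> str:
    """Render zoo.cfg 'server.N=host:peerPort:electionPort' entries."""
    lines = [
        f"server.{server_id}={host}:{peer_port}:{election_port}"
        for server_id, host in enumerate(hosts, start=1)
    ]
    return "\n".join(lines)


# Hypothetical hostnames: one ZooKeeper voter per site.
hosts = [
    "zk-dc1.example.com",
    "zk-dc2.example.com",
    "zk-office.example.com",
]
print(render_server_lines(hosts))
```

The same entries go into `zoo.cfg` on every server, and each server's `myid` file must contain the number matching its own `server.N` line.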

What allowed me to spread one cluster over 3 sites is that they are close enough together to have dark fiber between them. The latency is very low and does not impact ZK performance.

For Solr, I have full replicas in the two main DCs. The third office only hosts a ZK instance for quorum. With full replicas, all the data is in each DC. If my Solr needs grow later, I will shard, but for now our index is small.

It has proven solid for 4 years now, with 1 failure, and that failure was at the third office, not in a DC.
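The reason a single-site failure is survivable with this layout is plain majority arithmetic: ZooKeeper stays available as long as more than half of the voting servers are up. Here is a rough sketch of that check (the function and site names are mine, purely for illustration). Observers do not vote, so a site that only holds an observer counts as 0 voters.

```python
def has_quorum(voters_alive: int, voters_total: int) -> bool:
    """A ZooKeeper ensemble needs a strict majority of its voters up."""
    return voters_alive > voters_total // 2


def survives_site_loss(voters_per_site: dict[str, int]) -> dict[str, bool]:
    """For each site, report whether quorum survives losing that whole site."""
    total = sum(voters_per_site.values())
    return {
        site: has_quorum(total - count, total)
        for site, count in voters_per_site.items()
    }


# Two data centers, 3 voters split 2 + 1: losing dc1 breaks quorum.
print(survives_site_loss({"dc1": 2, "dc2": 1}))

# One voter per site across three sites: any single site can fail.
print(survives_site_loss({"dc1": 1, "dc2": 1, "office": 1}))

# All 3 voters in dc1, only a non-voting observer in dc2 (0 voters).
print(survives_site_loss({"dc1": 3, "dc2": 0}))
```

With only two sites, whichever one holds the majority of voters is always a single point of failure, and an observer at the second site does not change that. With one voter at each of three sites, losing any one site still leaves 2 of 3.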

OTHER TIPS

If your requirement is to serve all search requests from the local data center (the one where the request originated), then you don't need a cross-DC ZK deployment. A cross-DC ZK deployment is only needed to survive the crash of a whole DC, which is unlikely (that is what you pay $$$$ for), so in that case there is no need to spread the ZK cluster over multiple DCs.

Licensed under: CC-BY-SA with attribution