Question

I have read through a few articles that explain the theory behind consistent hashing, but most of them don't give much detail about how to handle adding or removing a node. I understand that if it is used in a cache layer like memcached, we might not need to do anything, but if it is used in distributed storage, it is critical to move some data to the correct node. What exactly happens when we need to add or remove a node?

A few other questions are:

  1. What's the best way to cope with servers of different sizes?
  2. How do you add or remove more than one machine at a time?
  3. How do you cope with replication and fault tolerance?

I hope someone can point me to an article that explains these.

Solution

But most of them don't give much detail about how to handle adding or removing a node.

Have you read Dynamo: Amazon’s Highly Available Key-value Store? This is covered in some detail in section 4.

What's the best way to cope with servers of different sizes?

There's nothing stopping you from putting different amounts of data on different servers in a Dynamo-like or Cassandra-like system. It adds significant complexity, especially in the failure-recovery cases, but it doesn't break the fundamentals of the protocol in any way.

OTHER TIPS

To your original question: you just need some way of handing off an object from one node to another. That's probably more a factor of how you add and remove nodes than of how consistent hashing works.

One hypothetical way to approach this is to introduce an intermediate state during rebalancing, in which you keep two consistent hashes: one for the old topology and one for the new topology. Take the difference between them, and each server can tell what it needs to do to conform to the new topology. For instance, each server requests the objects it is missing from the old owners, located via the old consistent hash. Once some sort of "done" state is reached, the old copies can be dropped (if necessary).
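Here is a minimal sketch of that diff step, assuming a toy HashRing class and made-up node names (none of this comes from a real system): a key needs to move only if its owner differs between the two topologies, and the plan tells the new owner where to pull it from.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # Stable hash for the example; any uniformly distributed hash works.
    return int(hashlib.md5(key.encode()).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes):
        # One point per node to keep the sketch short; a real ring
        # would place many virtual nodes per server.
        self._points = sorted((_hash(n), n) for n in nodes)
        self._hashes = [h for h, _ in self._points]

    def node_for(self, key: str) -> str:
        # First point clockwise of the key's hash, wrapping around.
        i = bisect.bisect_right(self._hashes, _hash(key)) % len(self._points)
        return self._points[i][1]


def rebalance_plan(keys, old_ring, new_ring):
    # Diff the two topologies: a key moves only if its owner changed.
    moves = {}
    for key in keys:
        src, dst = old_ring.node_for(key), new_ring.node_for(key)
        if src != dst:
            moves[key] = (src, dst)  # dst pulls key from src
    return moves


old = HashRing(["node-a", "node-b", "node-c"])
new = HashRing(["node-a", "node-b", "node-c", "node-d"])  # node-d joins
print(rebalance_plan([f"key-{i}" for i in range(20)], old, new))
```

Note that only the keys claimed by node-d appear in the plan; everything else stays put, which is exactly the property consistent hashing buys you.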

  1. Consistent hashing allows you to give a weight to each server by adjusting its number of replicas (virtual nodes) on the circle. This is probably its biggest advantage over HRW (rendezvous hashing). See the sketch after this list.

  2. This is probably the same as just adding one machine.

  3. HRW has a useful idea that consistent hashing, as typically explained, doesn't have: subsequent nodes. That is, instead of giving you just "the" node for an object, it can give you "the ordered list of nodes" for an object. If you decide you want, say, 3 replicas of each object, you take the top three from the list instead of just the first.

You can get the same effect in consistent hashing, although it's less intuitive: just keep going around the "circle" until you have N unique nodes, as the sketch below shows.
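Here is a minimal sketch tying points 1 and 3 together, with made-up server names and weights: a server's weight is expressed as the number of virtual nodes it places on the circle, and the ordered replica list comes from walking clockwise until N distinct servers have been seen.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16)


class WeightedRing:
    def __init__(self, weights):
        # weights: {server: virtual-node count}. A bigger server gets
        # more points on the circle, and therefore a larger share of keys.
        self._points = sorted(
            (_hash(f"{node}#{i}"), node)
            for node, count in weights.items()
            for i in range(count)
        )
        self._hashes = [h for h, _ in self._points]
        self._distinct = len(weights)

    def nodes_for(self, key: str, n: int):
        # Walk clockwise from the key's position, skipping virtual nodes
        # of servers already chosen, until we have n distinct servers.
        n = min(n, self._distinct)
        result = []
        i = bisect.bisect_right(self._hashes, _hash(key))
        while len(result) < n:
            node = self._points[i % len(self._points)][1]
            if node not in result:
                result.append(node)
            i += 1
        return result


ring = WeightedRing({"big-box": 200, "medium-box": 100, "small-box": 50})
print(ring.nodes_for("user:42", n=2))  # two distinct replicas, in order
```

Skipping virtual nodes of already-chosen servers is what makes the walk safe: without it, two "replicas" could land on the same physical machine.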

Licensed under: CC-BY-SA with attribution