Question

I use a ConsistentHashingRouter to distribute data from one actor to a set of other actors. Each message consists of a tuple like this: (items: Set[Int], msg: String). I wrote a case class that implements ConsistentHashable and defines the set of integers as the consistent hash key, like this.

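import akka.routing.ConsistentHashingRouter.ConsistentHashable

case class Message(items: Set[Int], msg: String) extends ConsistentHashable {
  // The whole set of integers is used as the consistent hash key
  override def consistentHashKey: Any = items
}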

Now when I let some producer actor send lots of messages to the router, the router distributes them quite unevenly to the target nodes.

I tried it with different numbers of target actors. In all cases, the actor that received the most messages got more than twice as many as the actor that received the fewest.

When using hashing, I would expect the messages to be distributed evenly among the targets. Am I missing something here?

Solution

consistentHashKey returns the object that will be used to calculate the hash key (if you don't return a String or a byte array, MurmurHash is applied to the serialized bytes of that object). I don't know how evenly distributed the resulting hashes are for your data. You should look at the "items" values you actually encounter, since those might be quite biased.
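One way to check for biased keys is to count how often each distinct "items" value occurs in a sample of your messages. Identical keys always hash to the same target, so a few heavily repeated sets are enough to skew the load. The following is only a rough sketch: KeySkew and topKeys are hypothetical names made up for illustration, and the sample data is invented.

```scala
object KeySkew {
  // Count how often each distinct key occurs in the sample and
  // return the n most frequent keys, most frequent first.
  def topKeys[K](sample: Seq[K], n: Int): Seq[(K, Int)] =
    sample
      .groupBy(identity)
      .map { case (key, occurrences) => key -> occurrences.size }
      .toSeq
      .sortBy { case (_, count) => -count }
      .take(n)

  def main(args: Array[String]): Unit = {
    // Invented sample; in practice, collect the `items` of real messages.
    val sample: Seq[Set[Int]] =
      Seq(Set(1, 2), Set(1, 2), Set(3), Set(1, 2), Set(4, 5))
    topKeys(sample, 3).foreach { case (key, count) => println(s"$key -> $count") }
  }
}
```

If a handful of keys account for most of the sample, the imbalance comes from the data rather than from the router.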

Also, consistent hashing does not distribute keys perfectly evenly. See: http://en.wikipedia.org/wiki/Consistent_hashing

In short, the interval of hash keys is wrapped around onto itself to form a ring, and this ring is subdivided by random points (the hashes of the nodes) into intervals (buckets). These buckets can end up with unequal sizes. Usually, the more nodes you have, the more "equal" the buckets will be, but this is not guaranteed.
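The ring described above can be sketched in a few lines. This is a simplified toy model, not Akka's actual implementation: RingDemo, buildRing, nodeFor and counts are hypothetical names, and the replicas parameter (the number of points each node gets on the ring) is an assumption added for experimenting. It uses MurmurHash3 from the Scala standard library.

```scala
import scala.collection.immutable.TreeMap
import scala.util.hashing.MurmurHash3

object RingDemo {
  // Place every node on the ring at `replicas` pseudo-random points.
  // Colliding points simply overwrite each other in this toy model.
  def buildRing(nodes: Seq[String], replicas: Int): TreeMap[Int, String] = {
    val points = for {
      node <- nodes
      i    <- 0 until replicas
    } yield MurmurHash3.stringHash(s"$node-$i") -> node
    TreeMap.empty[Int, String] ++ points
  }

  // A key belongs to the first node point at or after its hash;
  // past the last point, wrap around to the first point of the ring.
  def nodeFor(ring: TreeMap[Int, String], key: String): String = {
    val it = ring.iteratorFrom(MurmurHash3.stringHash(key))
    if (it.hasNext) it.next()._2 else ring.head._2
  }

  // Number of keys that land on each node (nodes with no keys are omitted).
  def counts(nodes: Seq[String], replicas: Int, keys: Seq[String]): Map[String, Int] = {
    val ring = buildRing(nodes, replicas)
    keys.groupBy(key => nodeFor(ring, key)).map { case (node, ks) => node -> ks.size }
  }

  def main(args: Array[String]): Unit = {
    val nodes = (1 to 5).map(i => s"node$i")
    val keys  = (1 to 100000).map(i => s"key$i")
    println(counts(nodes, 1, keys))   // one point per node
    println(counts(nodes, 100, keys)) // many points per node
  }
}
```

Comparing the two printed maps shows how much the per-node counts depend on where the node points happen to fall on the ring, even when the keys themselves are all distinct.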

Licensed under: CC-BY-SA with attribution