Question

I am running a 4-node cluster of apache-cassandra-1.2.8 and trying to load about 25 million records. The cluster is set up with the default (Murmur3) partitioner, and I assigned tokens based on the algorithm provided here.

nodetool displays the ring configuration as follows:

[root@node1 apache-cassandra-1.2.8]# bin/nodetool -host 10.5.50.250 -p 7199 ring

Datacenter: datacenter1
==========
Replicas: 1

Address    Rack   Status  State   Load      Owns    Token
                                                    -461168601842738790
127.0.0.1  rack1  Up      Normal  6.29 GB   25.00%  4611686018427387904
127.0.0.2  rack1  Up      Normal  613.9 MB  2.50%   0
127.0.0.3  rack1  Up      Normal  6.29 GB   25.00%  -9223372036854775808
127.0.0.4  rack1  Up      Normal  12.13 GB  47.50%  -461168601842738790

As you can see, the load is not distributed evenly; I expected 25% on each node. Is my assumption about the Murmur3 partitioner wrong, or is my setup misconfigured?

Any insights on how to get good load balancing with the now-default Murmur3 partitioner, which is claimed to be faster than the previous default, the random partitioner?

Solution

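The problem is that the token for 127.0.0.4 is missing a digit at the end: it should be -4611686018427387904, not -461168601842738790. Because each node owns the range between the previous token and its own, the truncated token stretches 127.0.0.4's range to 47.5% of the ring and shrinks 127.0.0.2's to 2.5%.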
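To double-check tokens before putting them in the config, something like the following sketch may help. It assumes the Murmur3 token space runs from -2^63 to 2^63 - 1 and that a node owns the range from the previous token (exclusive) up to its own token (inclusive); the function names balanced_tokens and ownership are made up for this example and are not part of Cassandra.

```python
# Sketch: evenly spaced Murmur3 tokens and the ownership a token list implies.
# Helper names are hypothetical; only the arithmetic matters.
RING_MIN = -2**63   # smallest token in the Murmur3 token space
RING_SIZE = 2**64   # total number of distinct tokens


def balanced_tokens(node_count):
    """Return node_count evenly spaced tokens, starting at RING_MIN."""
    return [RING_MIN + i * RING_SIZE // node_count for i in range(node_count)]


def ownership(tokens):
    """Map each token to the fraction of the ring its node owns.

    A node owns (previous_token, own_token], wrapping around the ring.
    """
    ordered = sorted(tokens)
    shares = {}
    for i, tok in enumerate(ordered):
        prev = ordered[i - 1]  # for i == 0 this wraps to the largest token
        span = (tok - prev) % RING_SIZE or RING_SIZE  # one node owns it all
        shares[tok] = span / RING_SIZE
    return shares


if __name__ == "__main__":
    for tok in balanced_tokens(4):
        print(tok)
    # Tokens from the question, including the truncated one.
    asked = [4611686018427387904, 0, -9223372036854775808, -461168601842738790]
    for tok, share in sorted(ownership(asked).items()):
        print(f"{tok:>21}  {share:.2%}")
```

For four nodes the first loop prints -9223372036854775808, -4611686018427387904, 0 and 4611686018427387904. Feeding in the tokens from the question reproduces the 25% / 47.5% / 2.5% / 25% split that nodetool reported.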

You should also consider using virtual nodes (vnodes), since they give you load balancing without calculating tokens, and there's no need to rebalance after scaling your cluster.
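As a rough sketch, vnodes are switched on per node in cassandra.yaml through the num_tokens setting. The fragment below assumes a fresh node joining a new cluster; converting a cluster that already holds data needs extra steps that are not covered here.

```yaml
# cassandra.yaml fragment (sketch for a new node, not a migration recipe)
num_tokens: 256     # number of vnodes this node claims; 256 is the value suggested in the shipped config
# initial_token:    # leave unset when num_tokens is used
```

With this in place, each node picks its own tokens at bootstrap, so there is no hand-typed token to get wrong.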

Licensed under: CC-BY-SA with attribution