Riak: Using n_val = 3 and only 3 nodes

Question

With only 3 nodes, it is not possible to fulfil the n_val requirement and ensure that the three copies stored of any object always will be on different nodes. The reason for this is in how Riak distributes replicas.

When storing or retrieving an object, Riak will calculate a hash value based on the bucket and key, and map this value to a specific partition on the ring. Once this partition has been determined the other N-1 replicas are always placed on the following N-1 partitions. If we assume we have a ring size of 64 and name these partitions 1-64, an object that hashes into partition 10 and belongs to a bucket with n_val set to 3 will also be stored in partitions 11 and 12.

With 3 nodes you will often see that the partitions are spread out alternating between the physical nodes. This means that for most partitions, the replicas will be on different physical nodes. For the last 2 partitions of the ring, 63 and 64 in our case, storage will however need to wrap around onto partitions 1 and 2. As 64 can not be evenly divided by 3, objects that do hash into these last partitions will therefore only be stored on 2 different physical nodes.

When a node fails or becomes unavailable in Riak, the remaining nodes will temporarily take responsibility for the partitions belonging to the lost node. These are known as fallback partitions and will initially be empty. As data is updated or inserted, these partitions will keep track of it and hand it back to the owning node once it becomes available. If Active Anti-Entropy is enabled, it will over time synchronise the fallback partition with the other partitions in the background.