Riak "w" is less than "n_val" clarification

https://stackoverflow.com/questions/23585048

riak

19-07-2023

Question

Consider a Riak cluster with 5 nodes (A, B, C, D, E) and n_val = 3:

1) The coordinator node stores a (k, v) pair with w=2. According to consistent hashing, the pair should go to node A, with replicas on nodes B and C. Suppose node C is down. Riak can still write to two nodes, A and B, which satisfies w=2. However, (k, v) should eventually be replicated to 3 nodes. Does this mean that Riak will send the write to D, and D will perform hinted handoff when C is back? Or will only the writes to A and B be performed, with C later synchronizing with these nodes through Active Anti-Entropy and read repair?

2) Suppose I would like to decommission node C from the cluster, and I simply shut this node down. It held data that is also replicated on nodes D and E, as well as replicas of data from nodes A and B. Now n_val = 3 is no longer satisfied; we only have two replicas. Will Riak automatically create new replicas for the node that is down, or should I execute a special command to mark node C as permanently down?

3) Consider a Riak cluster with 3 nodes (A, B, C), n_val=3, and node C down. Will it be able to satisfy a write with w=2?

Solution

1) Riak will make use of fallback vnodes: if node C is down during the write, node D will start a fallback vnode to handle requests until C becomes available again. As soon as C becomes available, D will initiate hinted handoff to bring the vnodes on node C up to date. The use of fallbacks is described here
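To make the fallback behaviour in point 1 concrete, here is a minimal sketch of a toy preference list. It is not Riak's actual ring code: it assumes one vnode per node and a simple clockwise walk, and the function name toy_preference_list and its parameters are made up for illustration.

```python
from itertools import cycle


def toy_preference_list(ring, first_index, n_val, down):
    """Return (node, role) pairs for one key in a simplified ring.

    Assumption: one vnode per node, replicas placed on consecutive nodes.
    Real Riak works on partitions/vnodes, so this is only an illustration.
    """
    size = len(ring)
    primaries = [ring[(first_index + i) % size] for i in range(n_val)]

    # Reachable nodes in ring order, starting right after the primaries.
    # A node that is already a primary may also host a fallback vnode.
    after = [ring[(first_index + n_val + i) % size] for i in range(size)]
    reachable = [node for node in after if node not in down]
    if not reachable:
        raise RuntimeError("no reachable nodes")
    spare = cycle(reachable)

    plist = []
    for node in primaries:
        if node in down:
            # A stand-in vnode accepts the write and hands it off later.
            plist.append((next(spare), "fallback for " + node))
        else:
            plist.append((node, "primary"))
    return plist


# Five-node cluster from the question, with C unreachable.
print(toy_preference_list(["A", "B", "C", "D", "E"], 0, 3, {"C"}))
```

In this toy model the write still reaches three vnodes: the primaries on A and B, plus a fallback on D that holds the data until C returns.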

2) If you are removing node C while it is still able to function and wish to run a smaller cluster, use riak-admin cluster leave (followed by plan and commit) so that Riak reassigns ownership and transfers the data before node C shuts down.

If you are removing node C to replace it with new hardware, first join the new node, then use replace before running plan and commit.

If node C has failed such that its data is unrecoverable, you can use force-remove or force-replace to have new, empty vnodes started to replace the lost ones; these will then be populated via AAE or read repair.
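The cases in point 2 map onto staged riak-admin cluster commands. The sketch below only builds the command strings for each case so the order is easy to compare; it does not run anything. The helper name staged_commands, the scenario labels, and node names such as riak@nodeC are hypothetical, and it assumes the riak-admin cluster subcommands join, leave, replace, force-remove, force-replace, plan, and commit.

```python
def staged_commands(scenario, old, new=None, member=None):
    """Return (run_on, command) pairs for one removal scenario.

    old    -- node being removed, e.g. "riak@nodeC" (hypothetical name)
    new    -- replacement node, needed only for the replace scenarios
    member -- any healthy node already in the cluster
    """
    if scenario in ("replace", "force-replace") and new is None:
        raise ValueError("a replacement node is required")

    if scenario == "leave":
        # Node still works: ownership and data are transferred away first.
        steps = [(member, f"riak-admin cluster leave {old}")]
    elif scenario == "replace":
        # New hardware: join the new node first, then stage the swap.
        steps = [(new, f"riak-admin cluster join {member}"),
                 (member, f"riak-admin cluster replace {old} {new}")]
    elif scenario == "force-remove":
        # Data is lost: new empty vnodes are filled by AAE or read repair.
        steps = [(member, f"riak-admin cluster force-remove {old}")]
    elif scenario == "force-replace":
        steps = [(new, f"riak-admin cluster join {member}"),
                 (member, f"riak-admin cluster force-replace {old} {new}")]
    else:
        raise ValueError(f"unknown scenario: {scenario}")

    # Nothing takes effect until the staged changes are reviewed and committed.
    steps.append((member, "riak-admin cluster plan"))
    steps.append((member, "riak-admin cluster commit"))
    return steps


for run_on, command in staged_commands(
        "replace", "riak@nodeC", new="riak@nodeF", member="riak@nodeA"):
    print(f"[{run_on}] {command}")
```

Simply shutting node C down corresponds to none of these sequences, which is why the cluster keeps treating C as temporarily unavailable rather than gone.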

3) Yes. Riak uses sloppy quorums, in which a fallback vnode can be used to satisfy a read or write quorum. If you want to count only primary vnodes, use pr or pw in the request instead of r or w. See Eventual Consistency for more detail.
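The difference between w and pw in point 3 can be sketched as a simple counting rule. This is a simplified model, not Riak's implementation: the function write_quorum_met is hypothetical, and it assumes each acknowledgement is labelled either "primary" or "fallback".

```python
def write_quorum_met(ack_roles, w, pw=0):
    """Check a write against w (any vnode) and pw (primary vnodes only).

    ack_roles -- roles of the vnodes that acknowledged the write,
                 each either "primary" or "fallback"
    """
    total_acks = len(ack_roles)
    primary_acks = sum(1 for role in ack_roles if role == "primary")
    # Sloppy quorum: fallbacks count toward w, but never toward pw.
    return total_acks >= w and primary_acks >= pw


# Three-node cluster, n_val=3, node C down: A and B answer as primaries,
# and a fallback vnode on a surviving node stands in for C.
acks = ["primary", "primary", "fallback"]

print(write_quorum_met(acks, w=2))        # True
print(write_quorum_met(acks, w=3))        # True, the fallback counts toward w
print(write_quorum_met(acks, w=2, pw=3))  # False, only two primaries answered
```

Under these assumptions, w=2 and even w=3 succeed with C down, while a request with pw=3 fails until C is back.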

Licensed under: CC-BY-SA with attribution
