promoting a master in replication

https://stackoverflow.com/questions/12810469

06-07-2021
|

Domanda

I am designing a replication algorithm, to promote a master among many slaves. I want it to be faster and simpler than Paxos. The basic idea is:

Assign each node a 'Promotion Priority', for example for 5 nodes there would be priorities: 50,40,30,20 and 10, 50 the highest and 10 the lowest.
When master needs to be elected, all slaves will send (at the same time) the other 4 nodes a message requesting to become a master, but only that master will be elected that will be confirmed by all slaves with a confirmation message. A slave will send confirmation message if its own 'Promotion Priority' is lower than the asking node, or if the asking node with higher priority times out to issue rejection message for its own request.
If a slave receives a rejection message from slave with higher 'Promotion Priority' it will abort the procedure.
There should be no nodes with the same priority.
There will be a minimum number of confirmation messages that a slave should collect in order to become a master.

This algorithm should be faster because all the slaves will be electing a master in parallel and the priority will help to speed up the process.

What do you think about it? Does any other algorithm for master promotion with priority exists?

Soluzione

What do you think about it?

It is hard to completely assess the validity of you algorithm without knowing the details of your requirements. Overall, it looks like a valid approach, but there are a few issues that I think deserve some attention.

Your question has some similarities to A distributed algorithm to assign a shared resource to one node out of many. Consequently, some of the arguments raised in my answer to that question hold for this question as well.

When master needs to be elected, all slaves will send (at the same time) the other 4 nodes a message requesting to become a master, but only that master will be elected that will be confirmed by all slaves with a confirmation message.

This approach assumes that all slaves know how many slaves are present at any time -- otherwise the supposed master can never draw the conclusion when it has received a confirmation from all slaves. Implicitly, this means that no slaves can leave and join the system without breaking the algorithm.

In practice though, these slaves will come and go, because of crashes, reboots, network outages etc. The chances of this increase with the number of slaves, but whether or not this is a problem depends on your requirements. How fault tolerant does your system have to be?

By the way, since you mention that there are many slaves, I assume that you are using multicast or broadcast to send the request messages. Otherwise, depending on what many means to you, your set-up could be error prone with regard to administrating where all slaves reside.

A slave will send confirmation message if its own 'Promotion Priority' is lower than the asking node, or if the asking node with higher priority times out to issue rejection message for its own request.

Similar to the previous remark: a slave might draw the wrong conclusion if some slave has problem responding for whatever reason. In fact, if one slave is down or has a network problem, all other slaves will draw the same (most likely erroneous) conclusion that the non-responsive slave is the master.

This algorithm should be faster because all the slaves will be electing a master in parallel

The issues raised in this answer are almost inherent to doing the master selection in a distributed fashion though, and hard to resolve without introducing some kind of centralized decision maker. You gain some, you lose some...

Does any other algorithm for master promotion with priority exists?

Another approach would be to have all slaves in the system constantly maintain administration about who is the current master. This could be done (at the cost of some network bandwidth) by having every slave multicasting/broadcasting its priority periodically, via some sort of heartbeat message. As a result, every slave will be aware of every other slave, and at the moment that a master needs to be selected, every slave can do that instantly. Network issues or other "system health" problems will be detected because heartbeats are missed. This algorithm is flexible with regard to slaves joining and leaving the system. The higher the heartbeat frequency, the more responsive your system will be to topology changes. However, you might still run into issues of slaves running drawing independent conclusions because of a disconnected network. If that is a problem, then you might not be able to solve this in a completely parallel fashion.

Autorizzato sotto: CC-BY-SA insieme a attribuzione

Non affiliato a StackOverflow