Question

I'm trying to wrap my head around a hypothetical scenario.

Imagine we have a very write-heavy distributed database and sharding is not an option.

I wonder whether, and how, multi-leader replication (write-write) is more efficient than a single-leader setup (write-read). Multi-leader has the overhead of syncing the databases and propagating writes to the other master(s), so ultimately it ends up with the same number of write operations.

In which cases is multi-leader replication for write-heavy applications considered more performant than single-leader, and in which cases is it not?

I understand the question is broad and nuanced but would be happy to read some thoughts on the subject.

Solution

Every write is performed on every machine, whether it is a Master or a Slave. The corollary to that is that there is a limit to write scaling with traditional Replication. Even Galera and Group Replication are just variants on Master(s) and Slave(s). Each server has all the data, and it is kept as up-to-date as is practical.

Hence, Sharding is the only way to get write scaling.
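To make that concrete, here is a minimal back-of-the-envelope sketch in Python. The function names and the 9000 writes/sec figure are invented for illustration. It only counts writes and ignores how expensive each write is on a given node.

```python
def writes_applied_per_node_replicated(total_writes_per_sec: float, nodes: int) -> float:
    """With replication, every node applies every write, whatever the node count."""
    if nodes < 1:
        raise ValueError("need at least one node")
    return total_writes_per_sec


def writes_applied_per_node_sharded(total_writes_per_sec: float, shards: int) -> float:
    """With sharding (assuming keys are spread evenly), each shard applies only its slice."""
    if shards < 1:
        raise ValueError("need at least one shard")
    return total_writes_per_sec / shards


if __name__ == "__main__":
    # Hypothetical workload: 9000 writes/sec across 1, 3 and 5 servers.
    for n in (1, 3, 5):
        replicated = writes_applied_per_node_replicated(9000, n)
        sharded = writes_applied_per_node_sharded(9000, n)
        print(f"{n} servers: replicated={replicated:.0f}/s per node, sharded={sharded:.0f}/s per node")
```

Adding replicas leaves the per-node write count flat, while adding shards divides it.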

OK, let me try to answer your question anyway.

With RBR (Row-Based Replication), the Slave can usually apply a write (INSERT/UPDATE/DELETE/etc) with less effort than the Master that originally executed it.

That implies that having multiple Masters will spread the "Master effort" out somewhat. Hence, having more Masters helps in scaling, and moving more reads off to Slaves helps, too.
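A rough way to see why the gain is only "somewhat" is a toy cost model, sketched below in Python. The cost units are hypothetical, not measurements: `origin_cost` stands for the effort of executing a write on the Master that receives it, and `apply_cost` for the cheaper effort of applying the replicated row change elsewhere. Writes are assumed to be spread evenly across Masters.

```python
def load_per_master(total_writes: float, masters: int,
                    origin_cost: float = 1.0, apply_cost: float = 0.5) -> float:
    """Estimated work per Master under a toy model.

    origin_cost and apply_cost are made-up units; the only assumption taken
    from the text is that applying a replicated row change is cheaper than
    originating the write.
    """
    if masters < 1:
        raise ValueError("need at least one master")
    local_writes = total_writes / masters        # writes this Master originates
    remote_writes = total_writes - local_writes  # writes replicated in from the others
    return local_writes * origin_cost + remote_writes * apply_cost


if __name__ == "__main__":
    for m in (1, 2, 4, 8):
        print(f"{m} master(s): load per master = {load_per_master(1000, m)}")
```

In this model the per-Master load drops as Masters are added, but it can never fall below `total_writes * apply_cost`, because every server still has to apply every write.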

Some configurations, and my comments:

M <-> M "Dual Master" -- If you are writing to both (as implied above), then it is a somewhat fragile system.

M1 -> M2 -> M3 -> M1 "Circular Masters" -- very fragile. If one server goes down, it is a nightmare to repair.

3+ Masters, each talking to each other "Galera Cluster" -- This solves the above fragilities, but suffers if there is large latency between Masters (for example, when the nodes are geographically separated for HA). This is about the best config for some write scaling today.

"Group Replication" -- a competitor of Galera when it comes to HA. I'm not sure about write scaling.

In all of the above, Slave(s) can hang off any or all Masters to provide read scaling.

For write scaling (and not HA), I would go with 3 Galera nodes in the same server room. If there are enough reads to worry about, then tack some read-only Slaves on.

Note: If you go past 3 nodes in Galera (or maybe it is 5), you risk some degraded performance, because each node must talk to every other node during each COMMIT.
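The growth in chatter can be illustrated with simple counting, sketched below in Python. This is only arithmetic on a fully connected cluster, not a model of Galera's actual group communication protocol, and the function names are made up.

```python
def peers_per_node(nodes: int) -> int:
    """Number of other nodes each node has to coordinate with."""
    if nodes < 1:
        raise ValueError("need at least one node")
    return nodes - 1


def pairwise_links(nodes: int) -> int:
    """Number of distinct node pairs in a fully connected cluster: n * (n - 1) / 2."""
    if nodes < 1:
        raise ValueError("need at least one node")
    return nodes * (nodes - 1) // 2


if __name__ == "__main__":
    for n in (3, 5, 7, 9):
        print(f"{n} nodes: {peers_per_node(n)} peers per node, {pairwise_links(n)} node pairs")
```

The number of peers per node grows linearly, while the number of node pairs grows quadratically, which is one reason small clusters are preferred.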

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange