Question

I am in the final stages of configuring a service that is accessible from four global locations (with plans to add more later). I will be running the servers on an Ubuntu 12.04 box with MariaDB. My initial thought was to create servers that run independently of each other with 4 distinct databases and live with the constraint that users would only be able to login to the server where they were initially registered.

However, I have just run into this article, which has got me thinking...

From my reading, if I set up a Galera cluster with master-master replication as suggested in the article, I can have the luxury of one large database that is consistently available across all four servers. I have gathered (and am hoping) that, with the cluster set up correctly and functioning well, I need to change pretty much nothing in my PHP code (the four MariaDB instances will have the same user to access the database), not even the PDO connection string.

However, this sounds almost too good to be true. My questions are:

  • Are there other issues involved here that make for complications?
  • Do the PHP PDO connection strings need to be altered in any way?
  • Does the fact that my application is already structured to ensure that there is absolutely zero chance of two servers attempting to simultaneously write the same row help?
  • And finally, am I right in reading from the MariaDB docs that this will not work with the TokuDB storage engine?
  • Is there a way to specifically stop the replication of a selected table? Could I in fact exploit the "only InnoDB/XtraDB" constraint and use another storage engine on the table I do not want to have replicated?

Solution

Are there other issues involved here that make for complications?

There are some Known Limitations that you should be aware of. Generally, a cluster should have an odd number of nodes to prevent split-brain conditions. An even number will usually work too, but if the cluster splits into two equal halves, neither half has a majority.
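To see why an odd node count is preferred, here is a minimal Python sketch of the majority rule. It assumes every node carries equal weight, and the function names are made up for illustration; they are not part of Galera.

```python
def has_quorum(partition_size, cluster_size):
    """A partition keeps quorum only with a strict majority of the nodes."""
    return 2 * partition_size > cluster_size


def surviving_partitions(partition_sizes):
    """Return the sizes of the partitions that would keep accepting queries."""
    cluster_size = sum(partition_sizes)
    return [size for size in partition_sizes if has_quorum(size, cluster_size)]


# Four nodes split evenly: neither half has a majority, so both stop.
print(surviving_partitions([2, 2]))  # []
# Five nodes split 3/2: the three-node side carries on.
print(surviving_partitions([3, 2]))  # [3]
```

With four locations, an even split is possible, which is the case the odd-number advice is meant to avoid.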

Do the PHP PDO connection strings need to be altered in any way?

No. Your existing connection strings should work.

Does the fact that my application is already structured to ensure that there is absolutely zero chance of two servers attempting to simultaneously write the same row help?

Look at the known limitations and make sure your application will still do that. If you're using named locks, you'll need to change your application.

And finally, am I right in reading from the MariaDB docs that this will not work with the TokuDB storage engine?

TokuDB support was added in the recent Galera Cluster distribution. I have used it a little and it does replicate just like InnoDB, but I wouldn't rely on it since it's new in the Galera Cluster build.

Is there a way to specifically stop the replication of a selected table? Could I in fact exploit the "only InnoDB/XtraDB" constraint and use another storage engine on the table I do not want to have replicated?

I've heard a lot of people ask if they can omit tables or databases from replication, but I still haven't heard a good reason why. Galera replication provides HA and is cheap and easy, so even if some tables aren't important, I can't find any realistic reason not to replicate the data. That being said, you could keep data out of replication by using MyISAM/Aria for those tables.

I've been using MariaDB with Galera in multiple moderately sized projects. It is the best solution I've found for HA, and it also provides performance benefits. Other solutions are generally expensive or not mature. One thing you should consider is setting up a proxy for connecting to the database servers, such as HAProxy, mysql-proxy, or glbd (which I use), to provide better redundancy and connection balancing for performance.
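As a rough illustration of the redundancy a proxy gives you, here is a small Python sketch of "try each node until one answers". This is not how HAProxy or glbd work internally. The host names and the `fake_connect` function are hypothetical stand-ins, and a real database driver raises its own exception class, which you would catch instead of `OSError`.

```python
def connect_first_available(hosts, connect):
    """Try each node in order and return the first connection that succeeds."""
    errors = []
    for host in hosts:
        try:
            return connect(host)
        except OSError as exc:  # connection refused, timeout, unreachable host
            errors.append(f"{host}: {exc}")
    raise ConnectionError("no cluster node reachable: " + "; ".join(errors))


def fake_connect(host):
    """Hypothetical stand-in for a real driver call; the first node is down."""
    if host == "db1.example.invalid":
        raise ConnectionRefusedError("node down")
    return f"connection to {host}"


nodes = ["db1.example.invalid", "db2.example.invalid"]
print(connect_first_available(nodes, fake_connect))
# connection to db2.example.invalid
```

A dedicated proxy does the same job outside the application and adds health checks and load balancing, so the application keeps a single connection string.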


In response to DroidOS's comment below:

  1. Every write in the cluster needs to be agreed upon by every node so any latency between nodes is added to every write. So, basically, every write will have the greatest round trip time between the writing server and the other nodes added to it.

  2. No. Galera replication is all or nothing across the entire cluster. If any node has a problem writing the data, which can happen if a table doesn't have a primary key, the node will gracefully shut itself down since it can't guarantee its data is consistent with the rest of the cluster. If that happens, the rest of the cluster will continue to operate normally. If there is a network issue and one of the segments still has quorum, that segment will continue to operate normally. Any segment without quorum will wait for enough nodes to regain quorum and will not accept queries in the meantime. With this behavior, you can be sure that any node that you are able to query is consistent with the rest of the cluster.
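Point 1 can be put into rough numbers. The Python sketch below follows the rule of thumb stated there (local commit time plus the slowest round trip to another node). The node names and timings are invented for illustration, not measurements.

```python
def estimated_write_latency_ms(local_commit_ms, peer_rtts_ms):
    """Local commit time plus the slowest round trip to any other node."""
    return local_commit_ms + max(peer_rtts_ms, default=0.0)


# Hypothetical round-trip times from the writing node to the other three.
rtts = {"node-b": 80.0, "node-c": 150.0, "node-d": 280.0}
print(estimated_write_latency_ms(2.0, list(rtts.values())))  # 282.0
```

In this made-up case, a write that takes 2 ms locally costs about 282 ms once the most distant node is included, which is why geographic spread matters for write-heavy workloads.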

OTHER TIPS

Given that this has turned out to be such a popular question I thought I should add an extra answer by way of comment for anyone who runs into it.

The big issue with synchronous replication is the latency that is introduced by the process. There will certainly be times when synchronous replication is required and latency has to be managed and then lived with. However, you might on reflection (as I did) realize that you can live with lazy replication. There are commercial solutions that deliver this, albeit at a hefty fee. You also have the possibility of rolling your own solution, which is easier than you might think.
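To make the idea of lazy replication concrete, here is a toy in-memory Python sketch. It is not the solution I built, and the `Node` class is purely hypothetical. It has no conflict handling at all, so it leans on the premise from the question that no two servers ever write the same row.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.rows = {}    # committed data, keyed by primary key
        self.outbox = []  # changes not yet shipped to the other nodes

    def write(self, key, value):
        """Commit locally and return at once; replication happens later."""
        self.rows[key] = value
        self.outbox.append((key, value))

    def ship(self, peers):
        """Push queued changes to every peer, then clear the queue."""
        for key, value in self.outbox:
            for peer in peers:
                peer.rows[key] = value
        self.outbox.clear()


london, tokyo = Node("london"), Node("tokyo")
london.write(1, "alice")
print(1 in tokyo.rows)  # False
london.ship([tokyo])
print(tokyo.rows[1])    # alice
```

The write returns without waiting for any remote node, and the price is that the other nodes lag behind until the next `ship` call.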

Licensed under: CC-BY-SA with attribution