Does MySQL NDB Cluster consider node distance? Will it use the replicas if they are nearer?

https://stackoverflow.com/questions/10405433

05-06-2021

Question

I'm building a very small NDB cluster with only 3 machines. This means that machine 1 will serve as MGM server, MySQL server, and NDB data node all at once. The database is only 7 GB, so I plan to replicate each node at least once. Now, a query might end up using data that is cached in the NDB node on machine 1, even if that node isn't the primary source for that data; in that case local access would be much faster (for obvious reasons).

Does the NDB cluster work like that? Every example I see has at least 5 machines. The manual doesn't seem to mention how to handle node differences like this one.

Solution

There are a couple of questions here:

Availability / NoOfReplicas

MySQL Cluster can give high availability when data is replicated across 2 or more data node processes. This requires that the NoOfReplicas configuration parameter is set to 2 or greater. With NoOfReplicas=1, each row is stored in only one data node, and a data node failure would mean that some data is unavailable and therefore the database as a whole is unavailable.

Number of machines / hosts

For HA configurations with NoOfReplicas=2, there should be at least 3 separate hosts. One is needed for each of the two data node processes, each of which holds a copy of all of the data. A third is needed to act as an 'arbitrator' when communication between the 2 data node processes fails. This ensures that only one of the data nodes continues to accept write transactions, and avoids data divergence (split brain). With only two hosts, the cluster is resilient to the failure of just one of them: if the other host fails instead, the whole cluster fails. The arbitration role is very lightweight, so this third machine can be used for almost any other task as well.
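As a rough illustration of the layout described above, a cluster config.ini for three hosts might look like the sketch below. The host names (host1/host2/host3.example.com) and node IDs are hypothetical placeholders, not values from the question. The sketch also assumes the management server (which can act as arbitrator) runs on the third host rather than on machine 1, as the question proposed.

```ini
# Minimal sketch of a 3-host layout with NoOfReplicas=2.
# Host names and NodeId values are made-up placeholders.

[ndbd default]
# Every row is stored on two data nodes.
NoOfReplicas=2

[ndb_mgmd]
# Management server on the third host; it can act as arbitrator.
NodeId=1
HostName=host3.example.com
ArbitrationRank=1

[ndbd]
# First data node.
NodeId=2
HostName=host1.example.com

[ndbd]
# Second data node, on a separate host from the first.
NodeId=3
HostName=host2.example.com

[mysqld]
# SQL node colocated with the first data node.
NodeId=4
HostName=host1.example.com
```

With this layout, losing any single host leaves at least one data node plus either the other data node or the arbitrator reachable. Placing the management server on the same host as a data node would not give that property.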

Data locality

In a 2-node configuration with NoOfReplicas=2, each data node process stores all of the data. However, this does not mean that only one data node process is used to read and write data. Both processes are involved in writes (as they must both maintain copies), and in general either process could be involved in a read.

Some work to improve read locality in a 2-node configuration is under consideration, but nothing is concrete.

This means that when MySQLD (or another NdbApi client) is colocated with one of the two data nodes, there will still be quite a lot of communication with the other data node.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow