سؤال

http://en.wikipedia.org/wiki/CAP_theorem

http://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf

I think it is not very straightforward why only two of

  1. Consistency
  2. Availability
  3. Partition tolerance

can hold for any given distributed database system. This conjecture was proved but is there an easier way to see why probably this might hold?

I am not looking for a proof, just a good way to understand why this theorem might make sense. What is the reasoning?

هل كانت مفيدة؟

المحلول

OK, let's imagine that you have a distributed database. Let's say you have a node in Oregon and one in California. The CAP theory says that you will run into problems when setting up this type of database.

For example, if you query data from one database, it needs to be the same as the data in the other database. This insures that whatever value you have in one database is guaranteed to be in the other (Consistency of the CAP theory). Doing this allows you to update the data in one database and query it from another, getting the same results.

A computer updating data in Oregon, transfers the data to California

When we update the data in the Oregon node, the data is sent to the California node so that the databases are consistent. In order to truly maintain consistency, we have to insure that both databases get the update before either are allowed to truly save the data (two-phase commit using distributed transactions). In other words, if the California database can't save the data for some reason (e.g. hard drive failure), then the database in Oregon will not save the data and will fail the transaction.

The problem with distributed transactions like the one above comes when we want to have high availability. In this scenario above, the process of trying to get both databases in sync is a very, very slow process. (Imagine, we have to send the data from Oregon to California, make sure it gets there, make sure that both databases have locks on the data, etc.) This causes major problems when we want a system that is fast and responsive even during times of high demand. (This is the Availability of the CAP theorem.)

Usually, what we do in order to insure high availability is we use replication instead of distributed transactions. So instead of guaranteeing that California can accept the data, we just go ahead and store it in the Oregon node and then send the data to California when we get around to it. This guarantees that we can always store the data, regardless whether California is ready to store the data or not.

The Oregon node updates the data while California reads the data.  Later, the data is moved to California

This improves Availability, but at the cost of Consistency. See, if someone updates the data in Oregon and then someone (at the same time) reads the data in California, they are not getting the new data--the databases are no longer consistent. In fact, they won't be consistent until Oregon sends the data to California!

So, that's the Availability -vs- Consistency trade-off.

Partition Tolerance is the third aspect of the CAP theory. Partitioning is, in this context, the idea that a database (or other distributed system) can break into separate sections and still function correctly.

The question becomes, what happens when both databases are running correctly, but the link from Oregon to California is severed?

Oregon is being updated while the California node is being read.  The network between the nodes is severed.

If we update the database in Oregon, we need to get the data to California one way or another (distributed transaction or replication). However, if the link between the two is severed, then the system has become partitioned and the databases are no longer linked together.

When this happens, your choices are to stop allowing updates (to maintain Consistency) at the cost of Availability or to allow the updates (to maintain the Availability) at the cost of consistency.

As you can see, partition tolerance creates direct trade-offs between Consistency and Availability.


There's obviously more to it than that, but those are a couple of examples on how these three major aspects of distributed systems work for and against each other. Julian Browne's explanation of CAP theory is an excellent place to learn more.

مرخصة بموجب: CC-BY-SA مع الإسناد
لا تنتمي إلى dba.stackexchange
scroll top