Eventual consistency in plain English

https://stackoverflow.com/questions/10078540

30-05-2021
|

Domanda

I often hear about eventual consistency in different speeches about NoSQL, data grids etc. It seems that definition of eventual consistency varies in many sources (and maybe even depends on a concrete data storage).

Can anyone give a simple explanation what Eventual Consistency is in general terms, not related to any concrete data storage?

Soluzione

Eventual consistency:

I watch the weather report and learn that it's going to rain tomorrow.
I tell you that it's going to rain tomorrow.
Your neighbor tells his wife that it's going to be sunny tomorrow.
You tell your neighbor that it is going to rain tomorrow.

Eventually, all of the servers (you, me, your neighbor) know the truth (that it's going to rain tomorrow), but in the meantime the client (his wife) came away thinking it is going to be sunny, even though she asked after one or more of the servers (you and me) had a more up-to-date value.

As opposed to Strict Consistency / ACID compliance:

Your bank balance is $50.
You deposit $100.
Your bank balance, queried from any ATM anywhere, is $150.
Your daughter withdraws $40 with your ATM card.
Your bank balance, queried from any ATM anywhere, is $110.

At no time can your balance reflect anything other than the actual sum of all of the transactions made on your account to that exact moment.

The reason why so many NoSQL systems have eventual consistency is that virtually all of them are designed to be distributed, and with fully distributed systems there is super-linear overhead to maintaining strict consistency (meaning you can only scale so far before things start to slow down, and when they do you need to throw exponentially more hardware at the problem to keep scaling).

Altri suggerimenti

Eventual consistency:

Your data is replicated on multiple servers
Your clients can access any of the servers to retrieve the data
Someone writes a piece of data to one of the servers, but it wasn't yet copied to the rest
A client accesses the server with the data, and gets the most up-to-date copy
A different client (or even the same client) accesses a different server (one which didn't get the new copy yet), and gets the old copy

Basically, because it takes time to replicate the data across multiple servers, requests to read the data might go to a server with a new copy, and then go to a server with an old copy. The term "eventual" means that eventually the data will be replicated to all the servers, and thus they will all have the up-to-date copy.

Eventual consistency is a must if you want low latency reads, since the responding server must return its own copy of the data, and doesn't have time to consult other servers and reach a mutual agreement on the content of the data. I wrote a blog post explaining this in more detail.

Think you have an application and its replica. Then you have to add new data item to the application.

Then application synchronises the data to other replica show in below

Meanwhile new client going to get data from one replica that not update yet. In that case he cant get correct up date data. Because synchronisation get some time. In that case it haven't eventually consistency

Problem is how can we eventually consistency?

For that we use mediator application to update / create / delete data and use direct querying to read data. that help to make eventually consistency

When an application makes a change to a data item on one machine, that change has to be propagated to the other replicas. Since the change propagation is not instantaneous, there’s an interval of time during which some of the copies will have the most recent change, but others won’t. In other words, the copies will be mutually inconsistent. However, the change will eventually be propagated to all the copies, and hence the term “eventual consistency”. The term eventual consistency is simply an acknowledgement that there is an unbounded delay in propagating a change made on one machine to all the other copies. Eventual consistency is not meaningful or relevant in centralized (single copy) systems since there’s no need for propagation.

source: http://www.oracle.com/technetwork/products/nosqldb/documentation/consistency-explained-1659908.pdf

Eventually consistency means changes take time to propagate and the data might not be in the same state after every action, even for identical actions or transformations of the data. This can cause very bad things to happen when people don’t know what they are doing when interacting with such a system.

Please don’t implement business critical document data stores until you understand this concept well. Screwing up a document data store implementation is much harder to fix than a relational model because the fundamental things that are going to be screwed up simply cannot be fixed as the things that are required to fix it are just not present in the ecosystem. Refactoring the data of an inflight store is also much harder than the simple ETL transformations of a RDBMS.

Not all document stores are created equal. Some these days (MongoDB) do support transactions of a sort, but migrating datastores is likely comparable to the expense of re-implementation.

WARNING: Developers and even architects who do not know or understand the technology of a document data store and are afraid to admit that for fear of losing their jobs but have been classically trained in RDBMS and who only know ACID systems (how different can it be?) and who don’t know the technology or take the time to learn it, will miss design a document data store. They may also try and use it as a RDBMS or for things like caching. They will break down what should be atomic transactions which should operate on an entire document into “relational” pieces forgetting that replication and latency are things, or worse yet, dragging third party systems into a “transaction”. They’ll do this so their RDBMS can mirror their data lake, without regard to if it will work or not, and with no testing, because they know what they are doing. Then they will act surprised when complex objects stored in separate documents like “orders” have less “order items” than expected, or maybe none at all. But it won’t happen often, or often enough so they’ll just march forward. They may not even hit the problem in development. Then, rather than redesign things, they will throw “delays” and “retries” and “checks” in to fake a relational data model, which won’t work, but will add additional complexity for no benefit. But its too late now - the thing has been deployed and now the business is running on it. Eventually, the entire system will be thrown out and the department will be outsourced and someone else will maintain it. It still won’t work correctly, but they can fail less expensively than the current failure.

In simple English, we can say: Although your system may be in inconsistent states, the aim is always to reach consistency at some point for each piece of data.

Eventual consistency is more like a spectrum. On one end you have strong consistency and on other you have eventual consistency. In between there are levels like Snapshot, read my writes, bounded staleness. Doug Terry has a beautiful explanation in his paper on eventual consistency thru baseball .

As per me eventual consistency is basically toleration to random data in random order every time you read from a data store. Anything better than that is a stronger consistency model. For example, a snapshot has stale data but will return same data if read again so it is predictable. Sometimes application can tolerate data which is stale for a given amount of time beyond which it demands consistent data.

If you look at meaning of consistency it relates more to uniformity or lack of deviation. So in non computer system terms it could mean toleration for unexpected variations. It could be very well explained thru ATM. An ATM could be offline hence divergent from account balance from core systems. However there is a toleration for showing different balances for a window of time. Once the ATM comes online, it can sync with core systems and reflect same balance. So an ATM could be said to be eventually consistent.

Autorizzato sotto: CC-BY-SA insieme a attribuzione

Non affiliato a StackOverflow