Idioms or algorithms for distributed transactions?

https://stackoverflow.com/questions/5386510

28-10-2019
Question

Imagine you have two entities on different systems and need to perform a transaction that alters one or both of them, based on information associated with one or both of them. The requirement is that either the changes to both entities complete or neither does.

A simple example, which essentially has to run these two lines on two separate pieces of hardware:

my_bank.my_account -= payment
their_bank.their_account += payment

Presumably there are algorithms or idioms that exist specifically for this sort of situation and that work correctly (for some predictable definition of correct) in the presence of other attempted access to the same values. The two-phase commit protocol seems to be one such approach. Are there any simpler alternatives, perhaps with more limitations? (E.g. perhaps they require that no system can shut down entirely or fail to respond.) Or maybe there are more complex ones that are better in some way? And is there a standard or well-regarded text on the matter?

Solution

There is also 3PC, the three-phase commit protocol. 3PC addresses some of the issues of 2PC by adding an extra phase called pre-commit. A participant in the transaction receives a pre-commit message to learn that all the other participants have agreed to commit but have not yet done so. This phase removes the uncertainty in 2PC that arises when all participants are waiting for either a commit or an abort message from the coordinator.
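To make the extra phase concrete, here is a minimal in-memory sketch of 3PC. Everything in it (the `Participant` class, the `State` names, `run_three_phase_commit`) is made up for illustration; there is no network, no persistence, and the timeout rule is the simplified textbook one that assumes no network partitions.

```python
from enum import Enum


class State(Enum):
    INIT = "init"
    READY = "ready"                # voted yes, waiting for pre-commit
    PRECOMMITTED = "precommitted"  # knows every participant voted yes
    COMMITTED = "committed"
    ABORTED = "aborted"


class Participant:
    """Hypothetical participant; a real one would persist its state."""

    def __init__(self, name, will_vote_yes=True):
        self.name = name
        self.will_vote_yes = will_vote_yes
        self.state = State.INIT

    def can_commit(self):
        # Phase 1: vote on whether the local change can be applied.
        self.state = State.READY if self.will_vote_yes else State.ABORTED
        return self.will_vote_yes

    def pre_commit(self):
        # Phase 2: learn that all participants voted yes.
        if self.state is State.READY:
            self.state = State.PRECOMMITTED

    def do_commit(self):
        # Phase 3: apply the change.
        if self.state is State.PRECOMMITTED:
            self.state = State.COMMITTED

    def abort(self):
        if self.state is not State.COMMITTED:
            self.state = State.ABORTED

    def on_coordinator_timeout(self):
        # Simplified rule: after pre-commit it is safe to commit,
        # before pre-commit the participant aborts.
        if self.state is State.PRECOMMITTED:
            self.state = State.COMMITTED
        elif self.state in (State.INIT, State.READY):
            self.state = State.ABORTED


def run_three_phase_commit(participants):
    """Drive all three phases; return True if the transaction committed."""
    votes = [p.can_commit() for p in participants]
    if not all(votes):
        for p in participants:
            p.abort()
        return False
    for p in participants:
        p.pre_commit()
    for p in participants:
        p.do_commit()
    return True
```

The point of the sketch is `on_coordinator_timeout`: a participant that has seen pre-commit no longer has to block waiting for the coordinator, which is the uncertainty that plain 2PC leaves open.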

AFAIK, most databases work just fine with the 2PC protocol because, in the unlikely event that it fails, they have transaction logs to undo or redo operations and leave the data in a consistent state.
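For the bank example in the question, 2PC could look roughly like the sketch below. It is only an illustration: `Account`, `transfer` and the log record names are hypothetical, the "log" is a plain Python list standing in for a durable transaction log, and both accounts live in one process rather than on separate machines.

```python
class Account:
    """Hypothetical participant holding one balance plus an append-only log."""

    def __init__(self, balance):
        self.balance = balance
        self.pending = None  # (txid, delta) staged during prepare
        self.log = []        # stands in for a durable transaction log

    def prepare(self, txid, delta):
        # Phase 1: vote yes only if the change can be applied later.
        # A staged change from another transaction acts as a lock.
        if self.pending is not None or self.balance + delta < 0:
            self.log.append((txid, "VOTE_NO"))
            return False
        self.pending = (txid, delta)
        self.log.append((txid, "PREPARED", delta))
        return True

    def commit(self, txid):
        # Phase 2: apply the staged change for this transaction.
        if self.pending and self.pending[0] == txid:
            self.balance += self.pending[1]
            self.pending = None
            self.log.append((txid, "COMMITTED"))

    def rollback(self, txid):
        # Drop the staged change, but only if it belongs to this transaction.
        if self.pending and self.pending[0] == txid:
            self.pending = None
        self.log.append((txid, "ABORTED"))


def transfer(txid, source, target, payment):
    """Coordinator: two-phase commit across two accounts."""
    participants = [(source, -payment), (target, payment)]
    votes = [acct.prepare(txid, delta) for acct, delta in participants]
    if all(votes):
        for acct, _ in participants:
            acct.commit(txid)
        return True
    for acct, _ in participants:
        acct.rollback(txid)
    return False
```

Usage would be something like `transfer("t1", my_account, their_account, 30)`: either both balances change or neither does, and each side's log records what it promised and what it eventually did.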

Most of this stuff is very well discussed in

"Database Solutions, second edition"

and

"Database Systems: The Complete Book"

For more on the distributed side, you might want to check the current state of Web Service technology for distributed transactions and workflows. Not my cup of tea, to be honest. There are frameworks for Python, Java and .NET to run these kinds of services (an example).

As my final-year project, some years ago, I implemented a distributed 2PC protocol on top of Web Services and was able to run transactions on two separate databases, just like in the example you gave. However, I am sure that today people implement this with a more REST-like approach, for instance see here. Even though some other protocols are mentioned in these links, in the end they all end up implementing 2PC.

In summary, a 2PC protocol implementation with proper operation logs to undo/redo in case of a crash is one of the most sensible options to go for.
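As a rough idea of what "logs to undo/redo" buys you after a crash, here is a small sketch of a participant rebuilding its state from its log. The log format (`(txid, record, delta)` tuples), the `recover` function and the `coordinator_decisions` lookup are all assumptions made for this example, not how any particular database does it.

```python
def recover(initial_balance, log, coordinator_decisions):
    """Rebuild a balance from a participant's log after a crash.

    log: list of (txid, record, delta) tuples in write order.
    coordinator_decisions: dict mapping txid to "COMMIT" or "ABORT",
        consulted for transactions that were prepared but never decided.
    Returns the recovered balance and the txids that are still in doubt.
    """
    prepared = {}
    outcome = {}
    for txid, record, delta in log:
        if record == "PREPARED":
            prepared[txid] = delta
        elif record in ("COMMITTED", "ABORTED"):
            outcome[txid] = record

    balance = initial_balance
    in_doubt = []
    for txid, delta in prepared.items():
        decision = outcome.get(txid)
        if decision is None:
            # Prepared but no decision logged: ask the coordinator.
            asked = coordinator_decisions.get(txid)
            if asked is None:
                in_doubt.append(txid)  # must stay blocked until resolved
                continue
            decision = "COMMITTED" if asked == "COMMIT" else "ABORTED"
        if decision == "COMMITTED":
            balance += delta  # redo the committed change
        # ABORTED: the staged change was never applied, so nothing to undo
    return balance, in_doubt
```

The transactions left in `in_doubt` are exactly the blocking case of 2PC: the participant promised to commit, so it cannot decide on its own until it hears from the coordinator.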

Licensed under: CC-BY-SA with attribution