Why can't a committed transaction be undone?

https://dba.stackexchange.com/questions/282650

13-03-2021

Question

Once a transaction has been committed, we cannot undo its effect by aborting it.

We have the log file, which has all the information needed to undo a committed transaction, so why is this not possible? We should be able to do the same with a committed transaction as we can with an aborted one.

Solution

An aborted transaction was never "seen" by other transactions, so it has had no effect on any data currently in the database!

A committed transaction (call it X) IS seen by other transactions (say Y and Z, and potentially many others). Their actions (INSERTs, UPDATEs or DELETEs) MAY have depended on the fact that X was committed, and further transactions (A, B, C, D...) may in turn have depended on Y and Z.

There is (potentially) a huge cascade of transactions that may or may not have had a different outcome depending on preceding transactions - the number of possible combinations grows at least exponentially, and maybe factorially or even worse.
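To make the cascade concrete, here is a minimal sketch of it. The dependency graph is made up for illustration (the transaction names follow X, Y, Z, A, B, C, D above), and the `affected_by` helper is hypothetical, not something a DBMS exposes. It simply walks the "who read what X wrote" relation to find everything that would be called into question if X were undone.

```python
from collections import deque

# Hypothetical dependency graph: key = transaction, value = transactions
# that read or relied on data it wrote.
reads_from = {
    "X": ["Y", "Z"],
    "Y": ["A", "B"],
    "Z": ["C"],
    "C": ["D"],
}

def affected_by(txn, graph):
    """Return every transaction that directly or indirectly depended on txn."""
    seen = set()
    queue = deque([txn])
    while queue:
        current = queue.popleft()
        for dependent in graph.get(current, []):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen

tainted = affected_by("X", reads_from)
print(sorted(tainted))  # ['A', 'B', 'C', 'D', 'Y', 'Z']
```

Even in this toy graph, undoing X puts six other committed transactions in doubt, and a real system records no such graph in the first place.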

You would rapidly need more bits on disk than there are atoms on the planet to keep track of all potential outcomes of this cascade. In a chess game there are **69,353,270,203,366** (69.3 trillion) possible sequences of moves (the exact figure is open to debate) in the first 10 moves (both players), so one can only imagine the number of possible scenarios if you have even a small number of people performing CrUD operations. (The r is in lower case here because you can't roll back a read (SELECT) operation, as no data changes.)

That's why there isn't a mechanism to roll back committed transactions. Of course, you could just roll back that one transaction X, IF doing so still left the database in a LOGICALLY consistent state, i.e. consistent according to the DDL: Foreign Keys, CHECK constraints, etc.

BUT it might not be consistent from the business/application point of view - stock quantities (just one of a myriad variables that I can think of) might be incorrect... the number of other potential errors is also huge! Think about any triggers that mightn't have been fired when they should have been or vice versa.
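The stock-quantity case can be sketched in a few lines. Everything here is a made-up illustration (the `receive` and `sell` functions and the `widget` item are assumptions, not taken from any real schema): transaction Y was only valid because X had already committed, so reversing X on its own leaves a value that no sequence of valid business operations could have produced.

```python
# Toy inventory with one hypothetical item.
stock = {"widget": 0}

def receive(item, qty):
    """Transaction X: a delivery increases stock."""
    stock[item] += qty

def sell(item, qty):
    """Transaction Y: a sale is only accepted if enough stock exists."""
    if stock[item] < qty:
        raise ValueError("insufficient stock")
    stock[item] -= qty

receive("widget", 10)   # X commits
sell("widget", 8)       # Y commits; it was only valid because X had committed

# Naively "undo" X alone by reversing its change.
stock["widget"] -= 10
print(stock["widget"])  # -8
```

Without a CHECK constraint on the quantity, the database would accept this state quite happily, even though the business has apparently sold 8 widgets it never had.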

It's a galactic scale SNAFU just waiting to happen and is the reason such functionality (thankfully) is not implemented by any of the major RDBMS providers - and nor is it likely to be! The mere fact that it's impossible would not, of course, be an impediment to companies claiming to offer such functionality - the words "snake-oil" spring to mind!

OTHER TIPS

You can achieve that result in Oracle using flashback (https://docs.oracle.com/cd/B28359_01/appdev.111/b28424/adfns_flashback.htm#g1026131).

IBM DB2 has a somewhat similar feature.

You could use database snapshots in older versions of SQL Server (https://docs.microsoft.com/en-us/sql/relational-databases/databases/database-snapshots-sql-server?view=sql-server-ver15) and newer versions have temporal tables (https://docs.microsoft.com/en-us/sql/relational-databases/tables/temporal-tables?view=sql-server-ver15).

It would be possible to write a DBMS to do this; it's just that none of the majors are written in that way. One problem is how to handle changes subsequent to the transaction which is to be retrospectively rolled back.

Most DBMSs use physical logging - "in transaction A, byte X changes value from Y to Z." To roll back, the logged action is reversed (byte X changes value from Z to Y). There is no indication of why a value changed. Say we see a certain byte's content changing from 2 to 4 and subsequently to 12. Is that "times 2, times 3" or is that "add 2, add 8"? If we rolled back the first action but retained the second, what value would we expect that byte to hold?
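The 2, 4, 12 example can be worked through as a small sketch. The two "histories" below are just the two interpretations from the paragraph above, written as plain functions. Both produce the same physical log, yet they disagree on the answer once the first action is dropped.

```python
# Physical log for one stored value: (transaction, before image, after image).
physical_log = [("T1", 2, 4), ("T2", 4, 12)]

# Two different logical histories that yield exactly that physical log.
histories = {
    "times 2, times 3": [lambda v: v * 2, lambda v: v * 3],
    "add 2, add 8": [lambda v: v + 2, lambda v: v + 8],
}

def run(operations, start):
    """Apply each operation in order to the starting value."""
    value = start
    for operation in operations:
        value = operation(value)
    return value

# With both actions applied, the two histories are indistinguishable.
assert all(run(ops, 2) == 12 for ops in histories.values())

# Drop the first action and keep the second: the histories now disagree.
without_first = {name: run(ops[1:], 2) for name, ops in histories.items()}
print(without_first)  # {'times 2, times 3': 6, 'add 2, add 8': 10}
```

The before and after images alone cannot tell us whether the right answer is 6 or 10, which is exactly the information a physical log does not keep.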

There is another way known as logical logging which, effectively, stores the DML operations. To retroactively roll back a committed transaction, the database would be restored to a state before that transaction and the log replayed, skipping the log rows we want removed. Subsequent actions can be replayed and the database ends up in a state consistent with the actions performed. The restore can happen from a backup, a snapshot or a checkpoint each of which would have to be written to persist a consistent representation of the data. While this doesn't feel like a rollback - where we step backward through the log undoing actions as we go - the end result is the same.
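A rough sketch of that restore-and-replay idea follows, under heavy simplifying assumptions: a single-user system, a one-row "database", and a made-up log format of `(transaction id, operation, amount)`. The `apply_op` and `replay` helpers are hypothetical names for illustration only.

```python
import copy

# Consistent state persisted before the log begins (backup/snapshot/checkpoint).
snapshot = {"balance": 100}

# Logical log: the operations themselves, not before/after byte images.
logical_log = [
    ("T1", "add", 50),
    ("T2", "multiply", 2),
    ("T3", "add", -30),
]

def apply_op(db, op, amount):
    """Re-execute one logged operation against the database state."""
    if op == "add":
        db["balance"] += amount
    elif op == "multiply":
        db["balance"] *= amount
    else:
        raise ValueError(f"unknown operation: {op}")

def replay(start_state, log, skip=frozenset()):
    """Restore the snapshot, then replay the log, skipping unwanted transactions."""
    db = copy.deepcopy(start_state)
    for txn_id, op, amount in log:
        if txn_id in skip:
            continue
        apply_op(db, op, amount)
    return db

print(replay(snapshot, logical_log))               # {'balance': 270}
print(replay(snapshot, logical_log, skip={"T1"}))  # {'balance': 170}
```

Note that the later transactions are re-executed against different inputs, so T2 doubles 100 instead of 150. Whether T2 would even have been issued in that situation is something the replay cannot know.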

There are reasons why logical logging is not used. In the worst case, recovery takes as long as the database has been in existence. In multi-user systems, concurrent writes can be difficult to reproduce exactly during replay, meaning the restored system risks ending in a different state from the original system.

So I think if you have the source code to a DBMS, a lot of time to spare, and a penchant for system programming this would be achievable. It would be a fun hold-my-beer exercise. It would not, however, be a viable application implementation. For all the reasons other posters have mentioned, that would be a disaster.

"We have the log file that has all information to undo a committed transaction, so why is this not possible?"

This is exactly what happens when you restore a database to a specific point in time. The transaction log entries are used to roll forward and roll back committed transactions to the desired point in time, essentially allowing you to undo a committed transaction.

The requirement, of course, is that you must overwrite the existing database with a prior backup (or restore a new copy of the database), applying transaction log entries up to the desired point in time. This would in effect be exactly what you're looking for.
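As a minimal sketch of point-in-time restore (not any particular product's mechanism; the log format and the `restore_to` helper are assumptions for illustration), start from the backup and apply log entries in commit order, stopping at the chosen time:

```python
# State captured by the prior backup.
backup = {"balance": 100}

# Simplified transaction log: (commit time, transaction id, change to balance).
log = [(1, "T1", 50), (2, "T2", -30), (3, "T3", 25)]

def restore_to(backup_state, log_entries, point_in_time):
    """Copy the backup, then roll forward through the log up to point_in_time."""
    db = dict(backup_state)
    for commit_time, _txn_id, delta in log_entries:
        if commit_time > point_in_time:
            break
        db["balance"] += delta
    return db

print(restore_to(backup, log, 3))  # {'balance': 145}
print(restore_to(backup, log, 1))  # {'balance': 150}
```

Restoring to time 1 does undo T2, but it also discards T3: everything committed after the chosen point goes with it, which is the price of this approach.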

Licensed under: CC-BY-SA with attribution

Not affiliated with dba.stackexchange