Is there a recommended approach for improving application liveness when multiple transactions are reading the same database object while one long-running transaction is updating it?

For example, a transaction Ti is updating several records of the [CUSTOMER] table (which can take hours), while multiple Tj transactions are trying to read the number of orders of each customer. I'm assuming that if Ti acquires an exclusive lock, all Tj transactions will be suspended, resulting in bad performance (low liveness).

My application is still at the design stage, and I haven't chosen a concrete DBMS yet.

My goal is for the Tj transactions to be served the newest committed version of [CUSTOMER] from before Ti began, and for Ti not to block the Tj transactions.


Solution

You could switch to optimistic concurrency control (OCC), under which writers typically do not block readers. Throughput will definitely improve in the scenario you describe, although depending on how you define "liveness", it may be worse than at present.

OCC gives each transaction a snapshot of the data. Typically this represents the committed values at the point the transaction started, though the details vary between implementations. So a transaction may not read the most recently committed value, since that value may have been written after the transaction started. It is a trade-off between latency and correctness; choose whichever is preferable for your scenario.
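
A minimal sketch of those snapshot semantics, using SQL Server's row-versioning SNAPSHOT isolation as one concrete implementation (the database name, the OrderCount and Region columns, and the predicate are hypothetical):

```sql
-- Assumed one-time setup: allow snapshot isolation on the database.
ALTER DATABASE SalesDb SET ALLOW_SNAPSHOT_ISOLATION ON;

-- Session 1: the long-running writer Ti, uncommitted for hours.
BEGIN TRANSACTION;
UPDATE [CUSTOMER] SET OrderCount = OrderCount + 1 WHERE Region = 'EU';
-- ... hours of further work before COMMIT ...

-- Session 2: a reader Tj. Under SNAPSHOT isolation it reads the last
-- committed row versions from before Ti's changes, without blocking.
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
SELECT CustomerId, OrderCount FROM [CUSTOMER] WHERE Region = 'EU';
COMMIT;
```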

Other tips

SQL Server offers Read Committed Snapshot Isolation (RCSI), under which readers don't block writers and vice versa. As Mustaccio said, though, the real problem here is that an update really shouldn't take hours. When done properly, with good table design that includes proper normalization, changing data should take a matter of seconds.
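
For reference, this is roughly how RCSI is enabled in SQL Server; the database name is hypothetical, and the termination clause shown is one common way to handle sessions that would otherwise block the change:

```sql
-- Switch the default READ COMMITTED level to use row versioning, so
-- readers see the last committed value instead of waiting on writers.
ALTER DATABASE SalesDb
SET READ_COMMITTED_SNAPSHOT ON
WITH ROLLBACK IMMEDIATE;  -- rolls back open transactions so the ALTER can proceed
```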

As mustaccio mentioned, it's not really normal to have a single UPDATE transaction that takes hours. This is indicative of either a data design problem or an implementation problem that requires performance tuning. If you could provide more concrete examples of what you've tried that resulted in a multi-hour update, I'm confident recommendations could be made on how to improve either the design or the implementation.

For example, in Microsoft SQL Server, I can update millions of records in a table containing billions of records in a matter of seconds, given appropriate indexing.
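
One common pattern behind that claim, sketched here with hypothetical table, column, and index names, is to support the UPDATE's predicate with an index and to commit in small batches, so that no single transaction holds locks for long:

```sql
-- Hypothetical index covering the UPDATE's filter columns.
CREATE INDEX IX_Customer_Region
    ON [CUSTOMER] (Region, NeedsRecount)
    INCLUDE (OrderCount);

-- Batched update: each iteration is its own short transaction, so locks
-- are held briefly instead of for the duration of a multi-hour statement.
DECLARE @rows int = 1;
WHILE @rows > 0
BEGIN
    UPDATE TOP (5000) [CUSTOMER]
    SET OrderCount = OrderCount + 1,
        NeedsRecount = 0            -- hypothetical flag so rows aren't reprocessed
    WHERE Region = 'EU'
      AND NeedsRecount = 1;

    SET @rows = @@ROWCOUNT;
END;
```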

To answer your direct question: improving concurrency will not fix a multi-hour update unless it's caused solely by readers locking the table against the update, and you describe your issue as the other way around.
