Haskell: How does TVar work?

https://stackoverflow.com/questions/10092655

30-05-2021
|

Question

How does TVar work? From what I've read it attempts to run all transactions immediately upon receiving them, however, a transaction completing invalidates other currently running transactions, which must then restart. Is this how TVar works?

If this was the case, if there were transactions 1ms long transactions occurring every 100ms, would that mean that a transaction that takes 200ms to process would never complete?

Solution

As long as two transactions access distinct TVars, they can both be committed simultaneously without invalidating each other.

Just to make it clear when a transaction is invalidated, let's consider the following scenario:

Suppose that t :: TVar Int is initialized to 0 and is read via readTVar t at the beginning of a transaction A.
Meanwhile, in another thread, transaction B is started in which a writeTVar t 1 is executed. Assume that B commits before A. The STM system will check whether there are any inconsistencies and conclude that it is safe for B to commit at this point, so now writeTVar t 1 becomes effective.
This, however, causes transaction A to be invalidated since the old value 0 of t was read at the beginning of A. (If A was allowed to commit, we would get a violation of atomicity.)

The original paper [1] on Haskell's STM system (see Sec 6.5) answers your question:

"Starvation is possible. For example, a transaction that runs for a very long time may repeatedly conflict with shorter transactions. We think that starvation is unlikely to occur in practice, but we cannot tell without further experience."

[1] Tim Harris, Simon Marlow, Simon Peyton Jones, and Maurice Herlihy. ACM Conference on Principles and Practice of Parallel Programming 2005 (PPoPP'05).

OTHER TIPS

If there were transactions 1ms long transactions occurring every 100ms, would that mean that a transaction that takes 200ms to process would never complete?

Transactions only conflict if they touch the same TVars, so if some of the 1ms transactions avoided all the variables affected by the 200ms transactions, then the 200ms one would be able to complete. Moreover, since the STM monad is quite strict about what's allowed inside (only memory accesses and pure computations!) it's very unusual to have such disparity between the length of transactions; usually, they will be only a few memory reads/writes long, and all the IO and other computations will be done outside the transaction. Moreover, whether a particular transaction is ever blocked forever by other transactions is a bit of a scheduling problem; I'm not 100% sure what GHC's current scheduler looks like, but it seems plausible that it gives preference to older (or higher failure-rate) transactions.

That said, livelock is a very real problem with STM, and is about as insidious and as difficult to reason about as deadlock in more traditional locking concurrency implementations.

How does TVar work?

You will probably enjoy this paper:

Tim Harris, Simon Marlow, Simon Peyton-Jones, and Maurice Herlihy, "Composable memory transactions", https://doi.org/10.1145/1065944.1065952 (preprint PDF here)

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow