Question

It seems that there has been a recent rising interest in STM (software transactional memory) frameworks and language extensions. Clojure in particular has an excellent implementation which uses MVCC (multi-version concurrency control) rather than a rolling commit log. GHC Haskell also has an extremely elegant STM monad which also allows transaction composition. Finally, so as to toot my own horn just a bit, I've recently implemented an STM framework for Scala which statically enforces reference restrictions.

All of these are interesting experiments, but they seem to be confined to that sphere alone (experimentation). So my question is: have any of you seen or used STM in the real world? If so, why? What sort of benefits did it bring? What about performance? (there seems to be a great deal of conflicting information on this point) Would you use STM again or would you prefer to use some other concurrency abstraction like actors?

Was it helpful?

Solution

I participated in the hobbyist development of the BitTorrent client in Haskell (named conjure). It uses STM quite heavily to coordinate different threads (1 per peer + 1 for storage management + 1 for overall management).

Benefits: less locks, readable code.

Speed was not an issue, at least not due to STM usage.

Hope this helps

OTHER TIPS

The article "Software Transactional Memory: why is it only a research toy?" fails to look at the Haskell implementation, which is a really big omission. The problem for STM, as the article points out, is that implementations must chose between either making all variable accesses transactional unless the compiler can prove them safe (which kills performance) or letting the programmer indicate which ones are to be transactional (which kills simplicity and reliability). However the Haskell implementation uses the purity of Haskell to avoid the need to make most variable uses transactional, while the type system provides a simple model together with effective enforcement for the transactional mutation operations. Thus a Haskell program can use STM for those variables that are truly shared between threads whilst guaranteeing that non-transactional memory use is kept safe.

We use it pretty routinely for high concurrency apps at Galois (in Haskell). It works, its used widely in the Haskell world, and it doesn't deadlock (though of course you can have too much contention). Sometimes we rewrite things to use MVars, if we've got the design right -- as they're faster.

Just use it. It's no big deal. As far as I'm concerned, STM in Haskell is "solved". There's no further work to do. So we use it.

We, factis research GmbH, are using Haskell STM with GHC in production. Our server receives a stream of messages about new and modified "objects" from a clincal "data server", it transforms this event stream on the fly (by generating new objects, modifying objects, aggregating things, etc) and calculates which of these new objects should be synchronized to connected iPads. It also receives form inputs from iPads which are processed, merged with the "main stream" and also synchronized to the other iPads. We're using STM for all channels and mutable data structures that need to be shared between threads. Threads are very lightweight in Haskell so we can have lots of them without impacting performance (at the moment 5 per iPad connection). Building a large application is always a challenge and there were many lessons to be learned but we never had any problems with STM. It always worked as you'd naively expect. We had to do some serious performance tuning but STM was never a problem. (80% of the time we were trying to reduce short-lived allocations and overall memory usage.)

STM is one area where Haskell and the GHC runtime really shines. It's not just an experiment and not for toy programs only.

We're building a different component of our clincal system in Scala and have been using Actors so far, but we're really missing STM. If anybody has experience of what it's like to use one of the Scala STM implementations in production I'd love to hear from you. :-)

We have implemented our entire system (in-memory database and runtime) on top of our own STM implementation in C. Prior to this, we had some log and lock based mechanism to deal with concurrency, but this was a pain to maintain. We are very happy with STM since we can treat every operation the same way. Almost all locks could be removed. We use STM now for almost anything at any size, we even have a memory manager implement on top.

The performance is fine but to speed things up we now developed a custom operating system in collaboration with ETH Zurich. The system natively supports transactional memory.

But there are some challenges caused by STM as well. Especially with larger transactions and hotspots that cause unnecessary transaction conflicts. If for example two transactions put an item into a linked list, an unnecessary conflict will occur that could have been avoided using a lock free data structure.

I'm currently using Akka in some PGAS systems research. Akka is a Scala library for developing scalable concurrent systems using Actors, STM, and built-in fault tolerance capabilities modeled after Erlang's "Let It Fail/Crash/Crater/ROFL" philosophy. Akka's STM implementation is supposedly built around a Scala port of Clojure's STM implementation. An overview of Akka's STM module can be found here.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top