Using Clojure DataFlow programming idioms

https://stackoverflow.com/questions/4565158

14-10-2019

Question

Can someone explain why and how I would use the Clojure Dataflow programming API? I can't seem to find much about it on the internet.

Solution

I think it's most helpful to first read up on what dataflow programming is. IMHO, the Groovy GPars guys have some of the best docs on dataflow. The GPars mailing lists have seen lots of discussion about dataflow vs. CSP vs. actors etc. in the past, and they are a great place to ask questions.

Some other useful links:

Data Flow concurrency in Groovy - Vaclav Pech (GPars)
Flowing with the data - Vaclav Pech (GPars)
Select dataflow - Vaclav Pech (GPars)
An example from GPars article - Alex Miller
Presentation - Jonas Boner (Akka / Scala)
Wikipedia of course

The Clojure impl is pretty bare-bones, basically building dataflow variables on top of refs and watch functions on those refs. You might find the actual code or the tests more useful than the docs.

The canonical example cited with dataflow variables is that of a spreadsheet, where each variable is a cell in the spreadsheet defined by the values from other cells. When one cell changes, the changes ripple forward in dependency order. Dataflow variables themselves are somewhat limited though - I think dataflow streams are where the idea gets more interesting.
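To make the spreadsheet idea concrete, here is a minimal sketch using only clojure.core. It is not the contrib dataflow API: the helper names `source-cell` and `formula-cell` are made up for illustration, and it uses atoms rather than refs to keep the example short. Each formula cell installs a watch on its inputs and recomputes itself when one of them changes.

```clojure
;; Hypothetical helpers for illustration; not part of any library.

(defn source-cell
  "A cell holding a plain value that the user sets directly."
  [v]
  (atom v))

(defn formula-cell
  "A cell whose value is (apply f (map deref inputs)).
  It is recomputed whenever any input cell changes."
  [f & inputs]
  (let [compute #(apply f (map deref inputs))
        out     (atom (compute))]
    (doseq [in inputs]
      ;; Watches on atoms run synchronously on the thread that made the change.
      (add-watch in (gensym "dep")
                 (fn [_key _ref old new]
                   (when (not= old new)
                     (reset! out (compute))))))
    out))

(def a       (source-cell 1))
(def b       (source-cell 2))
(def sum     (formula-cell + a b))
(def doubled (formula-cell #(* 2 %) sum))

(reset! a 10)          ; the change ripples a -> sum -> doubled
(println @sum @doubled) ; prints: 12 24
```

Note that this naive version just recomputes on every change. It does nothing to order updates when a cell depends on the same source through two different paths, which is the part a real dataflow implementation has to get right.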

In some sense, the idea of lightweight processing nodes, scheduled over a (usually smaller) fixed set of threads and connected by queue-like streams, describes at a very high level all of {dataflow streams, actors, CSP}. In all cases the goal is to maintain high throughput by keeping nodes that have work to do busy and not wasting cycles on nodes that don't, AND to spare users from managing explicit threads and locks (the nodes are decoupled by the queues/streams/channels in between them).
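As a rough illustration of that shape in plain Clojure, agents are a reasonable stand-in: each agent has its own queue of pending actions, and `send` runs those actions on a shared fixed-size thread pool. The sketch below wires two agents into a tiny pipeline. The names `squarer`, `collector` and `output` are made up for the example, and this is only an approximation of the idea, not how the contrib dataflow library or GPars is implemented.

```clojure
;; Collected results, in arrival order.
(def output (atom []))

;; Each node is an agent; its action queue plays the role of the input stream.
;; The nodes are stateless here, so the agent state stays nil.
(def collector (agent nil))
(def squarer   (agent nil))

(defn collect
  "Sink node: records each incoming value."
  [_state x]
  (swap! output conj x)
  nil)

(defn square
  "Processing node: squares its input and passes it downstream."
  [_state x]
  (send collector collect (* x x))
  nil)

;; Feed the pipeline. No explicit threads or locks anywhere.
(doseq [x (range 1 6)]
  (send squarer square x))

;; Wait for each stage in turn so that everything has drained.
(await squarer)
(await collector)

(println @output) ; prints: [1 4 9 16 25]
```

A node only occupies a pool thread while it has a queued message to process, which is the "don't waste cycles on idle nodes" property described above. If you run this as a script rather than in a REPL, call `(shutdown-agents)` at the end so the JVM can exit.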

Licensed under: CC-BY-SA with attribution
