Pluggable vector processing units in Clojure
09-10-2019
Question
I'm developing some simulation software in Clojure that will need to process lots of vector data (basically originating as offsets into arrays of Java floats, length typically in 10-10000 range). Large numbers of these vectors will need to go through various processing steps - e.g. normalising the vectors, concatenating together two streams of vectors, calculating a moving average etc.
Rather than doing everything in an imperative style, I was hoping to create a more functional-style Clojure solution that would do the following:
- allow any vector function to be turned into a pluggable module, e.g. (def module-a (make-module some-function))
- allow these modules to be composed in pipelines, e.g. (def combined-module (combine-in-series module-a module-b)) would feed the output of module-a into the input of module-b
- allow auxiliary functions to access state stored within a given module, e.g. (get-moving-average some-moving-average-module), which would need to work even if some-moving-average-module is embedded deep within a combined pipeline
- hide any boilerplate code behind the scenes, e.g. allocating sufficiently large temporary arrays for vector calculation.
Does this sound like a sensible approach?
If so, any implementation hints or libraries that might help?
Solution
In a functional language, everything is dataflow. You can use functions as your module concept.
To address each of your use-cases:
- A pluggable module is a Clojure function that takes a single argument: the state of your data vector. e.g.
(def module-a some-function)
To allow for easy extension by modules, I suggest using a Clojure map as your state, where one field is your array of floats.
- Composing modules is function composition. Clojure's comp applies its arguments right to left, so module-a runs first here. e.g.
(def combined-module (comp module-b module-a))
- Auxiliary functions are accessor functions, extracting state from your data. e.g. if your data is a Clojure map with a :moving-average field, then the keyword :moving-average is your accessor function. State is not stored in modules.
- Boilerplate code is hidden in the implementation of your functions, which can be declared anywhere, possibly in another file and namespace.
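To make the points above concrete, here is a minimal sketch of how they could fit together. All names in it (make-module, normalise, mean, update-moving-average, the :data key) are hypothetical and chosen for illustration, and the exponential moving average is just a simple stand-in for whatever averaging you actually need.

```clojure
;; Hypothetical names throughout; nothing here comes from a library.

(defn make-module
  "Lifts a function on float arrays into a module: a function on the
  state map that replaces the :data field and leaves other keys alone."
  [f]
  (fn [state]
    (update state :data f)))

(defn normalise
  "Returns a new float array scaled to unit Euclidean length.
  A zero vector is returned as an unchanged copy."
  [^floats v]
  (let [norm (Math/sqrt (areduce v i acc 0.0
                                 (+ acc (* (aget v i) (aget v i)))))]
    (if (zero? norm)
      (aclone v)
      ;; amap allocates the output array, so callers never see the boilerplate
      (amap v i out (float (/ (aget v i) norm))))))

(defn mean
  "Arithmetic mean of a float array."
  [^floats v]
  (/ (areduce v i acc 0.0 (+ acc (aget v i)))
     (alength v)))

(defn update-moving-average
  "Module: folds the mean of the current vector into an exponential
  moving average kept under :moving-average in the state map.
  On the first call the average is seeded with the current mean."
  [state]
  (let [m     (mean (:data state))
        prev  (:moving-average state m)
        alpha 0.5]                       ; assumed smoothing factor
    (assoc state :moving-average
           (+ (* alpha m) (* (- 1.0 alpha) prev)))))

(def module-a (make-module normalise))
(def module-b update-moving-average)

;; comp applies right to left, so module-a runs before module-b.
(def combined-module (comp module-b module-a))

(def result (combined-module {:data (float-array [3.0 4.0])}))

(vec (:data result))       ; the normalised vector, roughly [0.6 0.8]
(:moving-average result)   ; roughly 0.7
```

Because the average lives in the map rather than in the module, (:moving-average result) works no matter how deeply update-moving-average is nested inside a composed pipeline. To carry the average across a stream of vectors, pass the previous result's :moving-average along in the next input map.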
Other tips
Check out conduit: http://intensivesystems.net/tutorials/conduit-motive.html