質問

I'm working on implementing a music programming language parser in Clojure. The idea is that you run the parser program with a text file as a command-line argument; the text file contains code in this music language I'm developing; the parser interprets the code and figures out what "instrument instances" have been declared, and for each instrument instance, it parses the code and returns a sequence of musical "events" (notes, chords, rests, etc.) that the instrument does. So before that last step, we have multiple strings of "music code," one string per instrument instance.

I'm somewhat new to Clojure and still learning the nuances of how to use reference types and threads/concurrency. My parser is going to be doing some complex parsing, so I figured it would benefit from using concurrency to boost performance. Here are my questions:

  1. The simplest way to do this, it seems, would be to save the concurrency for after the instruments are "split up" by the initial parse (a single-thread operation), then parse each instrument's code on a different thread at the same time (rather than wait for each instrument to finish parsing before moving onto the next). Am I on the right track, or is there a more efficient and/or logical way to structure my "concurrency plan"?

  2. What options do I have for how to implement this concurrent parsing, and which one might work the best, either from a performance or a code maintenance standpoint? It seems like it could be as simple as: (map #(future (process-music-code %)) instrument-instances), but I'm not sure if there is a better way to do it like with an agent, or manual threads via Java interop, or what. I'm new to concurrent programming, so any input on different ways to do this would be great.

  3. From what I've read, it seems that Clojure's reference types play an important role in concurrent programming, and I can see why, but is it always necessary to use them when working with multiple threads? Should I worry about making some of my data mutable? If so, what in particular should be mutable in the code for the parser I'm writing? and what reference type(s) would be best suited for what I'm doing? The nature of the way my program will work (user runs the program with a text file as an argument -- program processes it and turns it into audio) makes it seem like I don't need anything to be mutable, since the input data never changes, so my gut tells me I won't need to use any reference types, but then again, I might not fully understand the relationship between reference types and concurrency in Clojure.

役に立ちましたか?

解決

I would suggest that you might be distracting yourself from more important things (like working out the details of your music language) by premature optimization. It would be better to write the simplest, easiest-to-code parser which you can first, to get up and running. If you find it too slow, then you can look at how to optimize for better performance.

The parser should be fairly self-contained, and will probably not take a whole lot of code anyways, so even if you later throw it out and rewrite it, it will not be a big loss. And the experience of writing the first parser will help if and when you write the second one.

Other points:

You are absolutely right about reference types -- you probably won't need any. Your program is a compiler -- it takes input, transforms it, writes output, then exits. That is the ideal situation for pure functional programming, with nothing mutable and all flow of data going purely through function arguments and return values.

Using a parser generator is usually the quickest way to get a working parser, but I haven't found a really good parser generator for Clojure. Parsley has a really nice API, but it generates LR(0) parsers, which are almost useless for anything which does not have clear, unambiguous markers for the beginning/end of each "section". (Like the way S-expressions open and close with parens.) There are a couple parser combinator libraries out there, like squarepeg, but I don't like their APIs and prefer to write my own hand-coded, recursive-descent parsers using my own implementation of something like parser combinators. (They're not fast, but the code reads really well.)

他のヒント

I can only support Alex Ds point that writing parsers is an excellent exercise. You should definitely do it in C one time. From my own experience, it's a lot of debugging training at least.

Aside from that, given that you are in the beautiful world of Clojure notice the following:

  • Your parser will transform ordinary strings to data structures, like

    {:command :declare, :args {:name "bazooka-violin", ...}, ...}

  • In Clojure you can read such data structures easily from EDN files. Possibly it would be a more valuable approach to play around with finding suitable structures directly before you constrain the syntax of your language too much for it to be flexible for later changes in the way your language works.

  • Don't ever think about writing for performance. Unless your user describes the collected works of Bach in a file, it's unlikely that it will take more than a second to parse.

  • If you write your interpreter in a functional, modular and concise way, it should be easy to decompose it into steps that can be parallelized using various techniques from pmap to core.reducers. The same of course goes for all other code and your parser as well (if multi-threading is a necessity there).

  • Even Clojure is not compiled in parallel. However it supports recompilation (on the JVM) which in contrast is a way more valuable feature to think about.

As an aside, I've been reading The Joy of Clojure, and I just learned that there is a nifty clojure.core function called pmap (parallel map) that provides a nice, easy way to perform an operation in parallel on a sequence of data. It's syntax is just like map, but the difference is that it performs the function on each item of the sequence in parallel and returns a lazy sequence of the results! This can generally give a performance boost, but it depends on the inherent performance cost of coordinating the sequence result, so whether or not pmap gives a performance boost will depend on the situation.

At this stage in my MPL parser, my plan is to map a function over a sequence of instruments/music data, transforming each instrument's music data from a parse tree into audio. I have no idea how costly this transformation will be, but if it turns out that it takes a while to generate the audio for each instrument individually, I suppose I could try changing my map to pmap and see if that improves performance.

ライセンス: CC-BY-SA帰属
所属していません StackOverflow
scroll top