Question

The material on parser combinators I have found covers building up complex parsers through composition, but I would like to know whether there are good approaches for defining parsers by tweaking the composed parsers of a library without duplicating the original library's logic.

For example, here is a simplified CSV parser defined in Real World Haskell:

import Text.ParserCombinators.Parsec

csvFile = endBy line eol
line = sepBy cell (char ',')
cell = many (noneOf ",\n")
eol = char '\n'

Assuming csvFile is defined in one library, can another library create its own CSV parser using a custom version of the cell parser without having to rewrite the line and csvFile parsers as well? Can the source library be rewritten to make this possible? This is simple enough for the CSV parser but I am interested in a broadly applicable solution.


Solution

Generally you'd need to abstract over the signature of the components you want to replace. For instance, in the CSV example we'd need to extend the type of csvFile to allow us to slot in a custom cell.

line cell = sepBy cell (char ',')
csvFile cell = endBy (line cell) eol
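As a rough sketch of how a client could use these parameterized parsers: the names `defaultCell`, `quotedCell`, `parseDefault` and `parseQuoted` below are made up for illustration and are not part of the original library. The quoted-cell rule is only an assumed example of a custom cell.

```haskell
import Text.ParserCombinators.Parsec

-- Library side: every parser that depends on `cell` takes it as an argument.
line :: Parser a -> Parser [a]
line cell = sepBy cell (char ',')

csvFile :: Parser a -> Parser [[a]]
csvFile cell = endBy (line cell) eol

eol :: Parser Char
eol = char '\n'

-- The cell parser from the question, now just one possible argument.
defaultCell :: Parser String
defaultCell = many (noneOf ",\n")

-- Client side (hypothetical): a cell that may be wrapped in double quotes,
-- so that a quoted cell can contain commas.
quotedCell :: Parser String
quotedCell =
  between (char '"') (char '"') (many (noneOf "\"")) <|> defaultCell

-- Two CSV parsers sharing the same `line` and `csvFile` logic.
parseDefault, parseQuoted :: String -> Either ParseError [[String]]
parseDefault = parse (csvFile defaultCell <* eof) "(csv)"
parseQuoted  = parse (csvFile quotedCell <* eof) "(csv)"
```

The client never touches `line` or `csvFile`; it only chooses which `cell` to pass in.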

This gets unwieldy quickly as the number of replaceable components grows. Instead, we can package all of the desired extension points into a dictionary (a record) and pass it around:

data LanguageDefinition =
  LanguageDefinition { cell :: Parser Cell 
                     , ...
                     }

Then we parameterize the entire parser combinator library over this LanguageDefinition:

data Parsers = Parsers { line :: Parser Line, csvFile :: Parser [Line], ... }

mkParsers :: LanguageDefinition -> Parsers
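One way mkParsers might be filled in is sketched below. It assumes `Cell` is a `String` and `Line` is a list of cells, which the answer does not specify, and it includes only the record fields named above (the elided ones are left out). `defaultDefinition` and `trimmedDefinition` are hypothetical names.

```haskell
import Text.ParserCombinators.Parsec

-- Assumed for illustration: the answer does not define Cell or Line.
type Cell = String
type Line = [Cell]

-- Extension points a client may override (only `cell` is named in the answer).
data LanguageDefinition = LanguageDefinition
  { cell :: Parser Cell }

-- Parsers the library derives from a definition.
data Parsers = Parsers
  { line    :: Parser Line
  , csvFile :: Parser [Line]
  }

-- All composition logic lives here, written once against the dictionary.
mkParsers :: LanguageDefinition -> Parsers
mkParsers def = Parsers { line = lineP, csvFile = fileP }
  where
    lineP = sepBy (cell def) (char ',')
    fileP = endBy lineP (char '\n')

-- Library default, reusing the cell parser from the question.
defaultDefinition :: LanguageDefinition
defaultDefinition = LanguageDefinition { cell = many (noneOf ",\n") }

-- Hypothetical client: override one field with record update syntax,
-- here skipping leading spaces before delegating to the default cell.
trimmedDefinition :: LanguageDefinition
trimmedDefinition = defaultDefinition
  { cell = skipMany (char ' ') *> cell defaultDefinition }
```

A client would then run, for example, `parse (csvFile (mkParsers trimmedDefinition)) "(csv)" input`, getting the library's `line` and `csvFile` behaviour with its own `cell`.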

This is exactly the approach taken by the generalized Token parsing modules of Parsec: see Text.Parsec.Token and Text.Parsec.Language.
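For comparison, here is a small sketch of how that Parsec machinery is typically used. `emptyDef`, `makeTokenParser` and the record fields are from the parsec package; the definition `scriptDef` and the `binding` parser are invented for this example.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)
import Text.Parsec.Language (emptyDef)
import qualified Text.Parsec.Token as P

-- Start from Parsec's empty definition and override only some fields.
-- (This particular toy language is an assumption for illustration.)
scriptDef :: P.LanguageDef st
scriptDef = emptyDef
  { P.commentLine   = "#"
  , P.reservedNames = ["let"]
  }

-- makeTokenParser builds a whole record of token parsers from the definition.
lexer :: P.TokenParser st
lexer = P.makeTokenParser scriptDef

-- Individual parsers are pulled out of the record by field name.
identifier :: Parser String
identifier = P.identifier lexer

-- Parses input of the form: let <name> = <integer>
binding :: Parser (String, Integer)
binding = do
  P.whiteSpace lexer
  P.reserved lexer "let"
  name <- identifier
  P.reservedOp lexer "="
  value <- P.integer lexer
  eof
  return (name, value)
```

Changing `commentLine` or `reservedNames` in `scriptDef` changes how every token parser in `lexer` behaves, without any of the token-level logic being rewritten.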

Even more generic approaches abstract more and more into the dictionary being passed around. In effect this becomes an object-oriented, or OCaml-module-oriented, way of organizing code, and it can be very effective.

The Expression Problem, as folklore has it, states that there is a tension between adding functionality and adding variants. Here you are asking for a new variant, so you need to fix the functionality up front (by listing it all in a dictionary). Doing so opens a path to introducing new variants.

Licensed under: CC-BY-SA with attribution