Question

I am implementing a simple C-like language in OCaml and, as usual, AST is my intermediate code representation. As I will be doing quite some traversals on the tree, I wanted to implement a visitor pattern to ease the pain. My AST currently follows the semantics of the language:

type expr = Plus of string*expr*expr | Int of int | ...
type command = While of boolexpr*block | Assign of ...
type block = Commands of command list
...

The problem is now that nodes in a tree are of different type. Ideally, I would pass to the visiting procedure a single function handling a node; the procedure would switch on type of the node and do the work accordingly. Now, I have to pass a function for each node type, which does not seem like a best solution.

It seems to me that I can (1) really go with this approach or (2) have just a single type above. What is the usual way to approach this? Maybe use OO?

Was it helpful?

Solution

Nobody uses the visitor pattern in functional languages -- and that's a good thing. With pattern matching, you can fortunately implement the same logic much more easily and directly just using (mutually) recursive functions.

For example, assume you wanted to write a simple interpreter for your AST:

let rec run_expr = function
  | Plus(_, e1, e2) -> run_expr e1 + run_expr e2
  | Int(i) -> i
  | ...

and run_command = function
  | While(e, b) as c -> if run_expr e <> 0 then (run_block b; run_command c)
  | Assign ...

and run_block = function
  | Commands(cs) = List.iter run_command cs

The visitor pattern will typically only complicate this, especially when the result types are heterogeneous, like here.

OTHER TIPS

If you have to write many different recursive operations over a set of mutually recursive datatypes (such as an AST) then you can use open recursion (in the form of classes) to encode the traversal and save yourself some boiler plate.

There is an example of such a visitor class in Real World OCaml.

It is indeed possible to define a class with one visiting method per type of the AST (which by default does nothing) and have your visiting functions taking an instance of this class as a parameter. In fact, such a mechanism is used in the OCaml world, albeit not that often.

In particular, the CIL library has a visitor class (see https://github.com/kerneis/cil/blob/develop/src/cil.mli#L1816 for the interface). Note that CIL's visitors are inherently imperative (transformations are done in place). It is however perfectly possible to define visitors that maps an AST into another one, such as in Frama-C, which is based on CIL and offer in-place and copy visitor. Finally Cαml, an AST generator meant to easily take care of bound variables, generate map and fold visitors together with the datatypes.

The Visitor pattern (and all pattern related to reusable software) has to do with reusability in an inclusion polymorphism context (subtypes and inheritance). Composite explains a solution in which you can add a new subtype to an existing one without modifying the latter one code. Visitor explains a solution in which you can add a new function to an existing type (and to all of its subtypes) without modifying the type code. These solutions belong to object-oriented programming and require message sending (method invocation) with dynamic binding.

You can do this in Ocaml is you use the "O" (the Object layer), with some limitation coming with the advantage of having strong typing.

In OCaml Having a set of related types, deciding whether you will use a class hierarchy and message sending or, as suggested by andreas, a concrete (algebraic) type together with pattern matching and simple function call, is a hard question.

Concrete types are not equivalent. If you choose the latter, you will be unable to define a new node in your AST after your node type will be defined and compiled. Once said that a A is either a A1 or a A2, your cannot say later on that there are also some A3, without modifying the source code.

In your case, if you want to implement a visitor, replace your EXPR concrete type by a class and its subclasses and your functions by methods (which are also functions by the way). Dynamic binding will then do the job.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top