Pergunta

I have an abstract syntax tree which I want to compile down to different representations. I am now struggling to arrange the classes in a way that new representations can be added easily.

The easiest way to achieve this is to add a method for each representation, e.g. compile_to_foo, compile_to_bar. Additional representations can be added by monkey patching. The problem with this is that the compilation implementations are spread all over the place, and that it violates the single responsibility principle. The advantage is that compilation can be inherited.

Now, I could also define a compilation function containing a giant switch which dispatches on argument type. But this looses advantages of polymorphism, and makes inheriting behavior of the compilation more difficult. This is not a viable option.

An interesting solution would use an abstract factory:

Abstract Factory

This design looks fairly promising, but has some disadvantages:

  • The AST Node hierarchy cannot be extended without also extending the AbstractCompiler, and in turn all concrete compilers and their parallel hierarchies of concrete nodes.
  • The subtyping information of the AST nodes is spead across the whole system. it has to be specified between the AbstractNodes in order to share behaviour (I will be using roles), between the ConcreteNodes in order to share compile implementations, and in at least in the AbstractCompiler to provide default implementations (e.g. method NodeA() { return Node() }). This could be partially solved via metaprogramming.
  • When an AST is built, this can only compile down to one representation. If I want to have multiple outputs, I need to rebuild the AST with a different ConcreteCompiler.

Ideally, I would just pass a concrete compiler instance as a parameter to the compile method:

Ideal class diagram

… but I have no idea how the compile method could obtain the actual implementation from a parallel class hierarchy (without again using a giant switch on the node type).

I also investigated the Bridge pattern, but the solution does not seem applicable to my problem without creating a thousand little bridges.

I carefully read through this previous question: “Designing a robust architecture for multiple export types?”. The key difference is that the input data (there: equivalent, standalone data represenations) are now hierarchical AST nodes, so that inheritance between the compilation implementations is crucial.

The system will be implemented in Perl, so I'm not restricted to classic OOP, but can also use Metaprogramming, Roles (aka. traits), and Functional Programming.

What am I missing? Is there an architecture I could use to elegantly structure this system? How can I make the corresponding class from the parallel class hierarchy discoverable to the node classes, without sacrificing polymorphism?

Foi útil?

Solução

If I'm understanding your problem correctly this is exactly the kind of thing the Visitor pattern was designed to solve. Your AST nodes should implement a visit() method that takes a reference to a Visitor base class. The visit method on the AST node calls a method on the Visitor object that corresponds to the type of node it is (passing along any relevant information) then passes the Visitor node to its children.

The Visitor object is what emits your different representations. It has callbacks for each type of AST node you are going to visit, and simply gets passed to each AST node via the node's visit() method. The Visitor will also track any necessary state, etc. If you do things this way then you can easily switch out what you compile to by implementing different Visitors (or specialize similar Visitors in different ways via inheritance, etc.)

This is essentially a way of implementing your "giant switch" in a cleaner more easily extensible way. The only downside is that if you add new types of AST node, you may have to add a new callback for the node type in all your Visitor objects.

Licenciado em: CC-BY-SA com atribuição
scroll top