Question

What is this called, when a program takes source code as an input and spits out new source code and perhaps immediately runs the new code?

Examples:

  1. Automated refactoring
  2. Taking a function and turning it into a GUI (turning the function inputs into input boxes)
  3. Adding new capabilities to a function, like result caching.

The easiest languages to do this with are functional languages with simple syntax (Lisp, Scheme, etc.), right?

But you can do it with any language or between languages as long as you can parse source code into a tree or other data structure, right? (Using antlr or some other tool?)

So, what is this called? What are other examples of doing this, both big and small? What are some common tools? Please give me a jumping off point for thorough understanding of this type of programming.

(I am trying to leverage what I, as a single programmer, one pair of hands, can do, by writing code that will do things with my code.)

Solution

What you are interested in are "source-to-source" program transformation systems.

A parser is necessary but hardly sufficient. (In fact, you need a parser for the specific dialect of the language you are using; given how many "programming" languages people use, that means a lot of parsers.) You also need to capture the parse result (generally as a tree), be able to manipulate that tree somehow, and then regenerate valid source code from the tree. If you want the regenerated code to permanently replace manually written source, the parser/tree/unparser combination must preserve code formatting and comments to the maximum extent practical.
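As a small illustration of the parse / manipulate / regenerate pipeline, here is a minimal sketch using Python's standard `ast` module (Python 3.9+ for `ast.unparse`). It is not how DMS or any of the tools named here work; the sample function `area`, the `AddLruCache` class and the `add_caching` helper are made up for the example. It adds result caching to every function in a piece of source text, emits new source, and runs it.

```python
import ast

# Hypothetical input program; the comment is there on purpose.
SOURCE = '''
def area(w, h):
    # width times height
    return w * h
'''

class AddLruCache(ast.NodeTransformer):
    """Attach @functools.lru_cache(maxsize=None) to every function definition."""

    def visit_FunctionDef(self, node):
        self.generic_visit(node)  # also handle nested functions
        # Build the decorator expression by parsing a snippet of source.
        decorator = ast.parse("functools.lru_cache(maxsize=None)", mode="eval").body
        node.decorator_list.insert(0, decorator)
        return node

def add_caching(source: str) -> str:
    tree = ast.parse(source)                                  # parse to a tree
    tree = AddLruCache().visit(tree)                          # manipulate the tree
    tree.body.insert(0, ast.parse("import functools").body[0])
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)                                  # regenerate source

new_source = add_caching(SOURCE)
print(new_source)

# Optionally run the generated code right away.
namespace = {}
exec(compile(new_source, "<generated>", "exec"), namespace)
print(namespace["area"](3, 4))
```

Note that the regenerated source no longer contains the `# width times height` comment, and the original layout is not kept: `ast` discards both. That is fine for code you generate and run immediately, but it is exactly the formatting and comment problem described above if the output is meant to replace hand-written source.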

Often you can't really transform code unless the meaning of the identifiers is clear; so in practice you need to not only parse but construct symbol tables (at this point you have what amounts to a compiler front end, not a parser). Many transformations require tracking information flows (control flow, dataflow, points-to, who-calls, ...). Without these features, these tools are not very effective for the procedural/OO languages that constitute by far the bulk of source code. (The functional languages guys can get by with less flow analysis because everything is an expression, but most code written isn't functional, so this doesn't matter in a practical sense).
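To see why identifier meaning matters, consider renaming a global variable: the same name can be a parameter in one function and a reference to the global in another, and a purely textual or tree-only rewrite cannot tell them apart. The sketch below uses Python's standard `symtable` module to ask what a name means in each function scope; the sample program and the `classify` helper are assumptions made up for the illustration.

```python
import symtable

# Hypothetical input program: "total" means different things in each function.
SOURCE = '''
total = 0

def add(total, n):
    return total + n

def bump(n):
    global total
    total += n
'''

def classify(source: str, name: str) -> dict:
    """Report how `name` is bound in each top-level function of `source`."""
    top = symtable.symtable(source, "<example>", "exec")
    kinds = {}
    for scope in top.get_children():
        if name not in scope.get_identifiers():
            continue  # the name does not occur in this scope
        symbol = scope.lookup(name)
        if symbol.is_parameter():
            kinds[scope.get_name()] = "parameter"
        elif symbol.is_global():
            kinds[scope.get_name()] = "global"
        else:
            kinds[scope.get_name()] = "local"
    return kinds

print(classify(SOURCE, "total"))
```

A correct rename of the global `total` has to change the occurrences inside `bump` and leave those inside `add` alone, which is a decision that needs this kind of scope information rather than just the parse tree.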

There are a number of source-to-source transformation systems, some language-specific (e.g., JackPot for Java); a few are parameterized by language definitions (TXL, Stratego, DMS).

Some of the language-specific ones provide symbol tables and flow analysis. I know of only one transformation system that provides these capabilities for a number of programming languages, and that's our DMS Software Reengineering Toolkit.

DMS provides this for a number of real programming languages and their common dialects, not toys: C, Java, COBOL.

DMS is one of the few source-to-source program transformation systems for C++; it has been used in anger to make changes to large C++ programs. Clang is pretty close, I think, certainly in ambition; Rose Compiler is another, used mostly in supercomputing circles. Both of these, however, are specific to C++ (perhaps with C thrown in).

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow