Question

I'm writing program which translate Pascal to C and need some help. I started with scanner generator Flex. I defined some rules and created scanner which is working more or less ok. It breaks Pascal syntax into tokens, for now it's only printing what it found. But I have no idea what should I do next. Are there any articles or books covering this subject? What is the next step?

Was it helpful?

Solution

Why do you want to do such a Pascal to C converter?

If you just want to run some Pascal programs, it is simpler to use (or improve) existing compilers like gpc, or Pascal to C translators, like e.g. p2c

If you want to convert hand-written Pascal code to humanly-readable (and improvable) C code, the task is much more difficult; in particular, you probably want to convert the indentation, the comments, keep the same names as much as possible -but avoiding clashes with system names- etc!

You always want to parse some abstract syntax tree, but the precise nature of these trees is different. Perhaps flex + bison or even ANTLR may or not be adequate (you can always write a hand-written parser). Also, error recovery may or not be important to you (aborting on the first syntax error is very easy; trying to make sense of an ill-written syntactically-incorrect Pascal source is quite hard).

If you want to build a toy Pascal compiler, consider using LLVM (or perhaps even GCC middle-end and back-ends)

OTHER TIPS

The most common approach would be to build a parse tree in your front end, and then walk through that tree outputting the equivalent C in the back end. This gives you the flexibility to perform any reordering of declarations that's required (IIRC Pascal supports use before declaration, but C doesn't). If you're using flex for the scanner, tradition would dictate using bison for the parser, although there are alternatives. If you look, you can probably find a freely available Pascal syntax in the format expected by bison.

You have to know the Pascal grammar, the C grammar and built (design) a "something" (i.e. a grammar or an automata...) that can translate every Pascal rule in the corresponding C rule.

Than, once you have your tokenized stream, using some method like LR, you can find the semantic tree which correspond to the sequence of Pascal rule applied and convert every rule in the corresponding C rule (this can be easly done with Bison).

Pay attention that Pascal and C have not Context Free grammars, so more control will be necessary.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top