Compiler code optimization: AST vs. IR

Question 1

(a) Yes.

(b) Constant folding is one example; CSE is another; in fact almost anything to do with expression evaluation. IR-phase optimizations are more about what results from flow analysis.

Question 2

IR is a form of an AST (often it is "flattened", but there are deep tree IRs as well), it may not be easy to distinguish one from another, especially if compiler is implemented as a sequence of very small rewrites from an original AST all the way down to a final IR suitable for instruction selection.

Optimisations may happen anywhere on this chain, but some representations are more suitable for a wide range of optimisations, most notably, an SSA form, used by most of the modern compilers to do nearly all the optimisations.

Question 3

It's never too early to optimise (to coin a phrase). So there are optimisations performed before and during AST creation, on the AST itself, on the IR (if you have one) and on the code as it is generated. In C-like languages and those that compile to machine code, the effort goes into the later stages. In compilers targeting a VM I think there is less room for improvements at that stage.

Some early optimisations obviously work better than others. I don't know much about Decaf, but there are the obvious things like constant folding and constant expression evaluation. If you get the whole program in tree form before you have to generate any code you can find common subexpressions, do code migration, eliminate dead code/dead stores, hoist invariants, eliminate tail recursion and some kinds of strength reduction.

A lot of it depends on how hard you want to work and what your target is. You didn't say much about that.