Unit testing a compiler

Question 1

First of all parsing is usually a trivial part of compiler project. From my experience it never takes more than 10% of the time (unless we are talking about C++ but you wouldn't be asking questions here if you were designing it) so you'd rather not invest much of your time into parser tests.

Still, TDD (or however you call it) has it's share in developing the middle-end where you often want to verify that e.g. optimizations that you've just added actually did result in expected code transformation. From my experience, tests like this are usually implemented by giving compiler specially crafted test programs and grepping output assembly for expected patterns (was this loop unrolled four times? did we manage to avoid memory writes is this function? etc.). Grepping assembly isn't as good as analyzing structured representation (S-exprs or XML) but it's cheap and works fine in most cases. It's awfully hard to support as your compiler grows though.

Question 2

Firstly, if you test a compiler, you cannot get enough tests! Users really rely on compiler-generated output as if it were an always-golden-standard, so really be aware of quality. So if you can, test with every test you can come up with!

Secondly, use all testing methods available, yet use them where appropriate. Indeed, you may be able to mathematically prove, that a certain transformation is correct. If you are able to do so, you should.

But every compiler that I've seen some internals of involves heuristics and a lot of optimized, hand-crafted code at its internals; thus assisted proving methods typically are not applicable any more. Here, testing comes in place, and I mean a lot of it!

When collecting tests, please consider different cases:

Positive Standard-Conformance: your frontend should accept certain code patterns and the compiler must produce a correctly running program thereof. Tests in this category either need a golden-reference compiler or generator that produces the correct output of the test-program; or it involves hand-written programs that include a check against values furnished by human reasoning.
Negative Tests: every compiler has to reject faulty code, like syntax errors, type mismatches et cetera. It must produce certain types of error and warning messages. I don't know of any method to auto-generate such tests. So these need to be human-written, too.
Transformation tests: whenever you come up with a fancy optimization within your compiler (middle-end), you probably have some example code in mind that demonstrates the optimization. Be aware of transformations before and after such a module, they might need special options to your compiler or a bare-bone compiler with just that module plugged-in. Test a reasonable big set of surrounding module combinations, too. I usually did regression-testing on the intermediate representation before and after the specific transformation, defining a reference by intensive reasoning with colleagues. Try to write code on both sides of the transformation, i.e. code snippets that you want to have transformed and slightly different ones that must not be.

Now, this sounds like an awful lot of work! Yes it does, but there is help: there are several, commercial test-suites for (C-) compilers out in the world and experts that might help you in applying them. Here a small list of those known to me:

ACE: SuperTest Compiler Test and Validation Suite
Perennial C Compiler Validation Suite
Bugseng: ECLAIR
Tests in the GCC and LLVM environment
you name it! Or google "Compiler Test Suite"