Is it really always easier to write an interpreter than a compiler?

https://softwareengineering.stackexchange.com/questions/400468

03-03-2021
|

Question

I've read that generally it's easier to write an interpreter than a compiler. If that's true, what's the reason? Writing an interpreter seems to me equivalent in the level of difficulty as writing a compiler, provided that the compiler simply translates the code to another high-level language (like c), and then uses an already existing compiler (like gcc) to that high-level language. I'd very much appreciate a clarification about this issue, please.

Solution

Yes, it is ! But only if you have a LL(k) grammar, where k is 0 or 1 .

The reason is that you can parse the language using an easy functional decomposition corresponding roughly to the grammar terms. The functions then processes the input string from left to right:

You may then interpret as you parse. But with slow performance.
If you're more advanced, you may build an abstract syntax tree using for example the interpreter pattern and interpret the tree after the parsing. This leads to a more performant interpretation, since you do not need to reparse the input code every time you reexecue it.

Writing a compiler needs much more knowledge and skills:

you would need to translate the syntax tree in assembler or VM instructions, or some other kind of intermediate format; this is a level of complexity really higher rather than just doing the stuff in the interpreter and requires you to cope with the limitations of the target machine (limited number of registers, etc...).
You'd also have to do the optimization one would expect from a decent compiler (at least constant propagation and loop optimization). This requires some understanding of graph algorithms.
The output must then be produced for allowing integration into a tool suite that you do not control. This may also involve knowledge about linking process, and calling conventions needed to work with other languages or libraries...

On the other side, writing compiler usually involves parser generators, that can digest much more complex grammars than LL(1) (e.g. LALR(1)). This allows you to put more focus on the code generation rather than on routine parsing issues.

Edit: If you consider transpilers that convert source code from one language to another, it will depend on how close the source and target languages are, including from a semantic point of view. A general statement is therefore difficult to make. At best, you'd have a complexity comparable with an interpreter (instead of interpreting, rewriting the code with some simple transformations). But in many cases , semantic analysis and additional code-generation mechanisms will be required to cope with language gaps (e.g. different object models and lifecycles, run-time vs-compile time, ect...); you'll then face at least some of the complexities of full compilers.

OTHER TIPS

The statement was probably about compilers which compile to machine code. If you include compilers which compile one high level language to another it might be a lot easier. E.g. compiling C# to VB.Net is probably relatively easy. But compiling a high-level language to efficient machine code is non-trivial.

Many high-level languages do as you suggest, compile to C and then use an existing C compiler to generate machine code. This is exactly to avoid the complexity of compiling to machine code and instead reuse an existing compiler. Furthermore C compilers exist for a lot of architectures, so you get a wide reach "for free". Platforms like .net and the JVM use an intermediate byte-code format for the same purpose.

Compared to this, writing an interpreter is relatively easy, because you can use features in the underlying language. E.g. if you write an interpreter in a garbage-collected language, you basically get garbage collection for free. So writing an interpreter can be fairly simple.

Interpreter or compiler, you have to analyse the source code and figure out what each statement means. In some languages the effort for that is significant. And in these languages, while writing an interpreter might be cheaper, it isn't that much cheaper.

But you don't just want "an interpreter" or "a compiler". You want something that runs code quickly. With an interpreter, the speed at which you run code is severely limited, unless the language works at a very high level. Even the simplest compiler will produce code that will require enormous effort in an interpreter.

So if you have a requirement for speed, I would conjecture that whatever your requirements are, a compiler will get to those speed requirements cheaper than an interpreter.

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange