Question

For my AQA A2-level Computing project, I've decided to create a basic interpreted programming language, outputting to Console. I don't know how to build an interpreter. I have a copy of the purple dragon book, which is all about compiler design, as user166390 said on an answer to this question that the initial steps to building a compiler are the same to build an interpreter. My question is: is this true?

Can I use the techniques described in the dragon book to write an interpreter? And if so, which steps do I need to use and learn how to use?

Do I need to write a lexical analyser, a syntax analyser, a semantic analyser and an intermediate code generator, for example?

Could I get away with writing a basic parser that reads each line of the source code, parses it, and executes the instruction straight away, or is that a notoriously bad idea?

Was it helpful?

Solution

Yes, you can use the techniques described in the dragon book to write an interpreter.

You need a lexical analyzer and a parser regardless.

As others have pointed out, you do need to write the code to do actual execution -- but for a simple interpreter, this can be essentially the same as the syntax-directed translation described in the dragon book.

Everything else is optional.


If you want to skip straight from the parser to execution, you can. That will leave you with a very simple language, which can be both good and bad -- look at Tcl for an example of such a language.

If you want to interpret each line as you parse it, you can do that, too; this is what most command-line interpreters (Unix shell scripts, Microsoft's cmd.com and PowerShell) do, as well as interactive "REPL's" (Read-Eval-Print-Loops) for languages like Python and Ruby.

"Semantic analyzer" seems vague to me, but sounds like it should include most kinds of load-time consistency checks. This is also optional, but there are advantages in an interpreter that won't take any old garbage and try to execute it as a program...

"Intermediate code" is also kind of vague, but it is arguably optional. If you aren't executing directly from the program string (as in Tcl), you need some kind of internal representation to store your code once you've read it in. One popular option is to execute from an internal tree structure, based more or less closely on your parse tree, which is arguably distinct from producing "intermediate code". On the other hand, if your "intermediate code" could be written out more or less directly from your internal tree structure, you might as well count the internal structure as your "intermediate code".


There are important issues that you haven't addressed; one that stands out is: how do you want to handle names? Presumably you will want the programmer to be able to define and use his own names (e.g., for variables, functions, and so forth), so you will need to implement some kind of mechanism for that.

Exactly how names are handled is a big design decision, with major implications for the usability and implementability of your language. The simplest option for implementation is to use a single, global hash map to implement a single, global namespace -- but note that this choice has well-known usability problems...

OTHER TIPS

Could I get away with writing a basic parser that reads source code and executes the steps straight away?

You could but you'd be doing it the hard way.

Do I need to write a lexical analyser, a syntax analyser, a semantic analyser and an intermediate code generator, for example?

You can skip intermediate code generation except if you want to write a VM-based interpreter. Perl for example, used to execute its parse graph directly; this is in contrast with Java or Python, which produces intermediate byte code.

The interpreter part of a VM-based language is generally simpler than the interpreter that have to understand a parse graph (so each component in the system is simpler), however the complexity of the whole interpreter stack is generally simpler when you don't need to define an intermediate bytecode language. So pick your poison.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top