Question

Possible Duplicates:
How can a language's compiler be written in that language?
implementing a compiler in “itself”

I was looking at Rubinius, a Ruby implementation that compiles to bytecode using a compiler written in Ruby. I cannot get my head around this. How do you write a compiler for a language in the language itself? It seems like it would be just text without anything to compile it into an executable that could then compile the future code written in Ruby. I get confused just typing that sentence. Can anyone help explain this?

Solution

To simplify: you first write a compiler for the language in a different language. Then you use it to compile the compiler that is written in the language itself, and voilà!

So you need some language that already has a compiler. There are many of those, so you can write the first Ruby compiler (the compiler's compiler!) in, e.g., C. It compiles the Ruby compiler written in Ruby, which can then compile Ruby programs, including further versions of itself.
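The sequence above can be sketched with a toy example. Everything here is an assumption made up for illustration: "Mini" is a hypothetical language that is just Python with `fn` in place of `def`, Python stands in for C as the language that already has an implementation, and `bootstrap_compile`, `compile_mini`, and `COMPILER_IN_MINI` are invented names. It is not how Rubinius is actually built.

```python
# Stage 0: a Mini compiler written in the host language (Python here,
# standing in for C). It translates Mini to Python and loads the result.
def bootstrap_compile(mini_source):
    python_source = mini_source.replace("fn ", "def ")
    namespace = {}
    exec(python_source, namespace)
    return namespace


# The "real" compiler: the same logic, but written in Mini itself.
# Until something compiles it, this is just text.
COMPILER_IN_MINI = '''
fn compile_mini(mini_source):
    # Build the keyword from two pieces so the translation step does not rewrite it.
    keyword = "f" + "n "
    python_source = mini_source.replace(keyword, "def ")
    namespace = {}
    exec(python_source, namespace)
    return namespace
'''

# An ordinary Mini program to compile at the end.
PROGRAM = '''
fn square(x):
    return x * x
'''

# Stage 1: compile the Mini-written compiler with the stage 0 compiler.
stage1 = bootstrap_compile(COMPILER_IN_MINI)["compile_mini"]

# Stage 2: the Mini-written compiler compiles its own source.
stage2 = stage1(COMPILER_IN_MINI)["compile_mini"]

# From here on the stage 0 compiler is no longer needed.
print(stage2(PROGRAM)["square"](7))  # prints 49
```

Once `stage2` exists, `bootstrap_compile` could be thrown away: the language can keep compiling new versions of its own compiler.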

Of course, the original tools were written directly in machine code. They were used to build assemblers, which in turn were used to build compilers for, e.g., C or Fortran, which compiled compilers for... pretty much everything. Iterative development in action.

The process is called bootstrapping - possibly named after Baron Munchhausen's story in which he pulled himself out of a swamp by his own bootstraps :)

OTHER TIPS

Regarding the bootstrapping of a compiler it's worth reading about this devilishly clever hack.

http://catb.org/jargon/html/B/back-door.html

I get confused just reading that sentence.

It may help to think of the compiler as a translator, which is what compilers are often called. Its purpose is to take source code that humans can read and translate it into binary code that computers can read. In the case of Rubinius, the code that it reads happens to be Ruby code, and the code that it converts it into is machine code (actually LLVM intermediate code, which is itself further compiled into Intel machine code, but that's just a background detail). Rubinius itself could have been written in just about any programming language. It just happens to have been written in the same language that it compiles.
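To make the "translator" idea concrete, here is a minimal sketch of a compiler for a toy arithmetic language that emits instructions for a tiny stack machine. The instruction names (`PUSH`, `ADD`, `SUB`, `MUL`) and the functions `compile_expr` and `run` are made up for this example and have nothing to do with Rubinius's real bytecode. Python's standard `ast` module is borrowed as the parser.

```python
import ast


def compile_expr(source):
    """Translate an arithmetic expression into stack-machine instructions."""
    ops = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}

    def emit(node):
        # A number literal becomes a PUSH instruction.
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return [("PUSH", node.value)]
        # A binary operation: code for both operands, then the operator.
        if isinstance(node, ast.BinOp) and type(node.op) in ops:
            return emit(node.left) + emit(node.right) + [(ops[type(node.op)], None)]
        raise SyntaxError("unsupported construct in the toy language")

    return emit(ast.parse(source, mode="eval").body)


def run(bytecode):
    """A tiny virtual machine that executes the instructions."""
    stack = []
    for op, arg in bytecode:
        if op == "PUSH":
            stack.append(arg)
        else:
            right, left = stack.pop(), stack.pop()
            results = {"ADD": left + right, "SUB": left - right, "MUL": left * right}
            stack.append(results[op])
    return stack.pop()


code = compile_expr("1 + 2 * 3")
print(code)       # [('PUSH', 1), ('PUSH', 2), ('PUSH', 3), ('MUL', None), ('ADD', None)]
print(run(code))  # 7
```

Nothing in `compile_expr` depends on being written in Python. It only reads text and writes instructions, so the same translator could be written in any language, including the one it translates.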

Of course, you need something to run Rubinius in the first place, and this is most likely a regular Ruby interpreter. Note, however, that once you are able to run Rubinius on an interpreter, you can pass it its own source code, and it will create and run a compiled version of itself. This is called bootstrapping, from the old phrase "pulling yourself up by your bootstraps".

One final note: Ruby programs can't invoke arbitrary machine code. That part of Rubinius is actually written in C++.

Well, it is possible to do it in the following order:

  1. Write a compiler for Ruby in any language, say C.
  2. Now that you can compile Ruby code, you can write a compiler in Ruby that compiles Ruby code, and compile this compiler with the compiler you wrote in step 1. (Wahh, this sentence is strange!)
  3. From now on you can compile all your Ruby code with the compiler written in step 2. :)

Have fun! :)

A compiler is just something that transforms source code into an executable. So it doesn't matter what it is written in: it can be the same language it is compiling or any other language of sufficient power.

The fun comes when you are writing a compiler in the language it compiles, for a platform that doesn't yet have a compiler for that language. Your choices then are to compile it on another platform for which you do have a compiler, or to write a compiler in another language and use that to compile the "real" compiler.

It's a 2 step process:

  1. write a Ruby compiler in some other language like C, assuming a Ruby compiler doesn't yet exist
  2. since you now have a Ruby compiler, you can write a Ruby program that is a (new) Ruby compiler

Since somebody (Matz) already wrote a Ruby implementation, you "only" have to do the second part. Easier said than done.

All of the answers so far have explained how to bootstrap the compiler by using a different compiler. However, there is an alternative: compiling the compiler by hand. There's no reason why the compiler has to be executed by a machine; it can just as well be executed by a human.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow