Frage

I'm trying to start, figuring out, how creating a simple programming language work. Both with the syntax and the compiler itself. I've done some research on this topic, but I really don't get what my true question is all about.

I would think, that existing programming languages- compilers, is built on already existing programming languages, and therefore it would only make sense, to also base my compiler, on one of these languages.

Altho, since this in theory, the very first language with a compiler, didn't have another language to be based on, this can't be a true fact, and really must be based on something else, like the core Computer System language.

Which way is the best way to go, aswell as how, to get to my goal, which is creating a simple (With room to expanding) programming language?

Any answer is appreciated!

War es hilfreich?

Lösung 2

Which way is the best way to creating a simple programming language?

Unlike a majority of people I don't believe that creating a language is about using a compiler or interpreter. While you will most likely need a compiler or interpreter to implement your new language, they are tools just as is a pencil and paper. Don't start by using a tool and think you have accomplished something. It would be like using a wrench to make an engine that doesn't work, but you claim you made an engine because use used a wrench.

To create a good programming language you have to have goal for your language.

Since you mention programming language as opposed to some other type of language such as SQL, or a markup language such as HTML, I will take it that you want a Turing complete language.

Since most Turing complete languages support arithmetic I would start with a simple arithmetic expression language and build on that. There are a huge amount of examples of these on the Internet, but be fore warned that many have problems.

Next learn how to build Abstract Syntax Trees (AST) for arithmetic expressions. i.e.

3 + 2 * 6

    +
   / \
  3   *
     / \
    2   6

Do not use a compiler to build the AST, but build them by hand in the language you are using to write your programming language. i.e. If you are using Java to create a C++ compiler, then create the AST using Java.

Then write an evaluator for the AST that will walk the tree.

Once you are able to correctly build an AST and evaluate then add the lexer/parser which translates human readable source code into an AST. This is were you will need to get a good compiler design book.

Now you can compile the AST into assembly or byte code or just continue using an evaluator.

From this point on you just add features to your language, again starting by with the AST and then modifying the parser and code generator if you implemented one.

How to create a simple (with room to expanding) programming language?

As I noted: start with an arithmetic evaluator and add language concepts one at a time. Since you are new at this, you may find that a concept you add is actually better as a composition of simpler concepts and that you should add one of the simpler concepts first before adding the other concept finally reaching the higher concept.

Because your question is so general I can't give more specific answers. I see that you already have a few close votes noting such.

Andere Tipps

The very first compilers were based on assembler coding. Where did the assemblers come from?

The very first assemblers were based on painfully entered raw binary machine code instructions.

Hardly anybody enters binary; at very least, some kind of debugger program is used to do this. Hardly anybody codes compilers using assemblers anymore either; in many cases, a first compiler for a language is coded in C.

If you want to build a programming language, your first step is to get a compiler book (google "compiler book") and read it from cover to cover. If you try to avoid this step, you'll spend a huge amount of energy to try and invent what you need, and you'll likely fail.

Key tools for building compilers are parser generators, and program transformation systems. The former is the classic answer. The latter is a high-tech answer, and isn't very common, but can produce language processing tools much more quickly than classic answers. You need the compiler book background to understand these tools.

If you want to build an unlimited extensibility into your language, consider implementing a simple metaprogramming system in it.

This way you can start with some very simple and small language, and then build an arbitrary complex language or a set of different languages by extending it with its own macros. Such language can be trivially turned into any other language.

Take a look at Forth and Lisp - both can be built upon some extremely trivial core which is then extended to a fully capable language. You don't even need any other high level language to implement such a chain: a simple Forth can be bootstrapped in about a couple of hundred lines of x86 assembly.

If you're determined enough, you can even skip assembler and write in machine code straight away, for something of this scale it's quite manageable in a reasonable time and might give you some indispensable experience.

well inventing a language is inventing a language...how you implement it you usually use an existing language and then at some point assuming your new language is such that it can be used as a compiler, then you write a compiler in your new language and you use the binary from the current language to compile the same language compiler, then you do it once more with the binary from the same language compiler if that all works you are self-hosting. a compiler that can compile its own language compiler.

If you have never made a language or compiler then you are a long long way from that, you might try one of the many examples on line of a simple C like compiler that can only do some simple things (and can never self-compile), get your feet wet with something like that.

At the end of the day the programming language to be useful has to compile down to something, ideally machine code be it a real machine or virtual like python or java or old pascal. But sometimes one language compiles down to another known language, C++ for example, and then you use existing tools for that language to take down to something can execute.

It has been asked and answered a number of times now. If you go far enough back or want to get as pure as you can you start with machine code and a way to enter it (see the many computers for this, dec pdp series, altair, etc, the entry method being address, data and clock manual switches). The "compiler" or in the case of assembly/machine code the "assembler" is the human with paper and pencil or pen if you are that good. You manually write out your assembly language, you then manually convert that to machine code, then you manually flip switches to enter the program into ram then you manually push the run button. The first assemblers and then later compilers were written this way, you make an assembler using machine code using a human assembler, then self host that. Then you either use the human assembler or software assembler to the write your first compiler for your first ever non-assembly language, then you re-write the compiler in the new language, then you self-host that. Repeat until it is present day and there are more compilers and languages that you could ever master and a myriad of choices of editors and languages to build a compiler for a new language upon.

Lizenziert unter: CC-BY-SA mit Zuschreibung
Nicht verbunden mit StackOverflow
scroll top