Напишите компилятор для языка, который смотрит вперед и несколько файлов?

https://stackoverflow.com/questions/1578835

21-09-2019
|

Вопрос

На моем языке я могу использовать переменную класса в моем методе, когда определение появляется ниже метода. Это также может вызвать методы ниже моего метода и т. Д. Нет «заголовков». Возьмите этот C# пример.

class A
{
    public void callMethods() { print(); B b; b.notYetSeen();
    public void print() { Console.Write("v = {0}", v); }
    int v=9;
}

class B
{
    public void notYetSeen() { Console.Write("notYetSeen()\n"); }
}

Как мне это собрать? Я думал:

Pass1: преобразовать все в AST
PASS2: пройдите все классы и составьте список определенных классов/переменных/и т. Д.
PASS3: Пройдите код и проверьте, есть ли какие -либо ошибки, такие как неопределенная переменная, неправильное использование и т. Д., И создайте мой выход

Но, кажется, что это работает, я должен пройти 1 и 2 для всех файлов, прежде чем выполнять Pass3. Кроме того, мне кажется, что большая работа, пока я не найду синтаксисную ошибку (кроме очевидного, которое можно сделать во время анализа, например, забыть о закрытии скобки или написание 0 -х ответов вместо шестигранного значения). Моя кишка говорит, что есть другой путь.

Примечание: я использую Bison/Flex для генерации моего компилятора.

Решение

Мое понимание языков, которые обрабатывают прямые ссылки, заключается в том, что они обычно просто используют первый проход, чтобы составить список действительных имен. Что -то вроде того, чтобы просто поместить вход в таблицу (без заполнения определения), поэтому у вас есть кое -что, на что можно указать позже, когда вы выполняете свой реальный пас, чтобы генерировать определения.

Если вы попытаетесь построить полные определения по мере продвижения, вам в конечном итоге придется отменить откровенно, каждый раз сохраняя любые ссылки на неопределенные вещи до следующего прохода. Даже это не удалось бы, если есть круговые ссылки.

Другие советы

I would go through on pass one and collect all of your class/method/field names and types, ignoring the method bodies. Then in pass two check the method bodies only.

I don't know that there can be any other way than traversing all the files in the source.

I think that you can get it down to two passes - on the first pass, build the AST and whenever you find a variable name, add it to a list that contains that blocks' symbols (it would probably be useful to add that list to the corresponding scope in the tree). Step two is to linearly traverse the tree and make sure that each symbol used references a symbol in that scope or a scope above it.

My description is oversimplified but the basic answer is -- lookahead requires at least two passes.

The usual approach is to save B as "unknown". It's probably some kind of type (because of the place where you encountered it). So you can just reserve the memory (a pointer) for it even though you have no idea what it really is.

For the method call, you can't do much. In a dynamic language, you'd just save the name of the method somewhere and check whether it exists at runtime. In a static language, you can save it in under "unknown methods" somewhere in your compiler along with the unknown type B. Since method calls eventually translate to a memory address, you can again reserve the memory.

Then, when you encounter B and the method, you can clear up your unknowns. Since you know a bit about them, you can say whether they behave like they should or if the first usage is now a syntax error.

So you don't have to read all files twice but it surely makes things more simple.

Alternatively, you can generate these header files as you encounter the sources and save them somewhere where you can find them again. This way, you can speed up the compilation (since you won't have to consider unchanged files in the next compilation run).

Lastly, if you write a new language, you shouldn't use bison and flex anymore. There are much better tools by now. ANTLR, for example, can produce a parser that can recover after an error, so you can still parse the whole file. Or check this Wikipedia article for more options.

Лицензировано под: CC-BY-SA с атрибуция

Не связан с StackOverflow