Question

I am wondering how the syntax analysis and semantic analysis work.

I have finished the lexer and the grammar construction of my interpreter.

Now I am going to implement a recursive descent (top down) parser for this grammar

For example, I have the following grammar:

<declaration>  ::=   <data_type> <identifier> ASSIGN <value>

so i coded it like this (in java):

public void declaration(){
    data_type();
    identifier();
    if(token.equals("ASSIGN")){
        lexer();   //calls next token
        value();
    } else {
        error();
    }
}

Assuming I have three data types: Int, String and Boolean. Since the values for each data types are different, (ex. true or false only in Boolean) how can I determine if it fits the data type correctly? What part of my code would determine that?

I am wondering where would I put the code to:

1.) call the semantic analysis part of my program. 
2.) store my variables into the symbol table.

Do syntax analysis and semantic analysis happen at the same time? or do i need to finish the syntax analysis first, then do the semantic analysis?

I am really confused. Please help.

Thank you.

Was it helpful?

Solution

You can do syntax analysis (parsing) and semantic analysis (e.g., checking for agreement between <data_type> and <value>) at the same time. For example, when declaration() calls data_type(), the latter could return something (call it DT) that indicates whether the declared type is Int, String or Boolean. Similarly, value() could return something (VT) that indicates the type of the parsed. Then declaration() would simply compare DT and VT, and raise an error if they don't match. (Alternatively, value() could take a parameter indicating the declared type, and it could do the check.)

However, you may find it easier to completely separate the two phases. To do so, you would typically have the parsing phase build a parse tree (or abstract syntax tree). So your top level would call (e.g.) program() to parse a whole program, which would return a tree representing (the syntax of) the program, and you would pass that tree to a semantic_analysis() routine, which would traverse the tree, extracting pertinent information and enforcing semantic constraints.

OTHER TIPS

The short answer is: it depends on the definition of your programming language. And, since you only specified one derivation rule and three native types, there is no way of knowing. For example, if your programming language allows forward declarations like the c++ code below, then handling the derivation rule for function declaration (foo) is done without knowing the type of the variable serial

class Tree {
public:
    int foo(void)
    {
        return serial;
    }
    int serial;
};

Indeed, modern compilers separate the syntax analysis phase from the semantic analysis phase. The syntax analysis phase is performed first, making sure the input program agrees with the context free grammar of the language. And, in addition, produces an Abstract Syntax Tree (AST). Note the difference between an AST and a parse tree, as discussed in this SO post. The semantic analysis phase then traverses the AST and checks for type mismatches among other things.

Having said that, toy programming languages can sometimes couple semantic and syntax analysis together. When a recursive descent parser is used, you should have relevant recursive calls return a type.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top