Question

I'm writing a simple expression parser in Jison allowing an arbitrary number of newlines to follow a binary operator in an expression. This is my grammar so far:

{
    "operators": [
        ["left", "+", "-"],
        ["left", "*", "/", "%"]
    ],
    "bnf": {
        "program": [
            ["statement EOF", "return $1;"]
        ],
        "statement": [
            ["expression newlines", "$$ = $1 + ';';"]
        ],
        "expression": [
            ["NUMBER",                           "$$ = yytext;"],
            ["expression + expression",          "$$ = $1 + ' + ' + $3;"],
            ["expression - expression",          "$$ = $1 + ' - ' + $3;"],
            ["expression * expression",          "$$ = $1 + ' * ' + $3;"],
            ["expression / expression",          "$$ = $1 + ' / ' + $3;"],
            ["expression % expression",          "$$ = $1 + ' % ' + $3;"],
            ["expression + newlines expression", "$$ = $1 + ' + ' + $4;"],
            ["expression - newlines expression", "$$ = $1 + ' - ' + $4;"],
            ["expression * newlines expression", "$$ = $1 + ' * ' + $4;"],
            ["expression / newlines expression", "$$ = $1 + ' / ' + $4;"],
            ["expression % newlines expression", "$$ = $1 + ' % ' + $4;"]
        ],
        "newlines": [
            ["NEWLINE",          ""],
            ["newlines NEWLINE", ""]
        ]
    }
}

As you can see, I'm writing two rules for every binary operator, which seems very redundant. I would rather have a production that matches zero or more NEWLINE tokens (Kleene star) instead of one or more (Kleene plus). How would you do this in Jison?


Solution

I use Jison and simply ignore whitespace (including newlines) in the lexer.

The first rule in my %lex section is:

\s+   /* ignore */
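For the JSON-style grammar used in the question, the same idea could be expressed through a "lex" section. The following is only a sketch, not my actual grammar: the variable name and the token regexes are assumptions, and it is written as a JavaScript object with the same shape Jison accepts as JSON. Because no NEWLINE token ever reaches the parser, the "newlines" rules disappear entirely.

```javascript
// Hypothetical grammar object (name and regexes are illustrative assumptions).
const whitespaceInsensitiveGrammar = {
  lex: {
    rules: [
      // The action returns no token, so spaces, tabs and newlines are dropped.
      ["\\s+", "/* ignore */"],
      ["[0-9]+(?:\\.[0-9]+)?\\b", "return 'NUMBER';"],
      ["\\+", "return '+';"],
      ["-", "return '-';"],
      ["\\*", "return '*';"],
      ["\\/", "return '/';"],
      ["%", "return '%';"],
      ["$", "return 'EOF';"]
    ]
  },
  operators: [
    ["left", "+", "-"],
    ["left", "*", "/", "%"]
  ],
  bnf: {
    program: [["statement EOF", "return $1;"]],
    // No trailing "newlines" here: the lexer never emits NEWLINE.
    statement: [["expression", "$$ = $1 + ';';"]],
    expression: [
      ["NUMBER", "$$ = yytext;"],
      ["expression + expression", "$$ = $1 + ' + ' + $3;"],
      ["expression - expression", "$$ = $1 + ' - ' + $3;"],
      ["expression * expression", "$$ = $1 + ' * ' + $3;"],
      ["expression / expression", "$$ = $1 + ' / ' + $3;"],
      ["expression % expression", "$$ = $1 + ' % ' + $3;"]
    ]
  }
};
```

The trade-off is that a newline can no longer act as a statement terminator, so this only fits languages where line breaks carry no meaning.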

But you don't have to do it that way if you don't want to. Try something along these lines:

"expression": [
    ["NUMBER",                           "$$ = yytext;"],
    ["expression + expression",          "$$ = $1 + ' + ' + $3;"],
    ["expression - expression",          "$$ = $1 + ' - ' + $3;"],
    ["expression * expression",          "$$ = $1 + ' * ' + $3;"],
    ["expression / expression",          "$$ = $1 + ' / ' + $3;"],
    ["expression % expression",          "$$ = $1 + ' % ' + $3;"],
    ["expression newlines",              "$$ = $1;"],
    ["newlines expression",              "$$ = $2;"]
],

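That should allow any number of newlines before or after any expression. Be aware that a trailing newline can then be attached to more than one expression (and to the statement rule as well), so the grammar is ambiguous and Jison may report conflicts that you will need to resolve.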
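If you want the literal Kleene star the question asks about, one possible approach is a nonterminal with an empty alternative, placed only after each operator. This is a sketch I have not run through Jison: the names "opt_newlines" and "kleeneStarGrammar" are made up, and it assumes that an empty handle string denotes the empty production in Jison's JSON format. It is written as a JavaScript object with the same shape as the question's JSON.

```javascript
// Hypothetical grammar object; "opt_newlines" is an illustrative name.
const kleeneStarGrammar = {
  operators: [
    ["left", "+", "-"],
    ["left", "*", "/", "%"]
  ],
  bnf: {
    program: [["statement EOF", "return $1;"]],
    statement: [["expression newlines", "$$ = $1 + ';';"]],
    expression: [
      ["NUMBER", "$$ = yytext;"],
      // One rule per operator: opt_newlines matches zero or more NEWLINE
      // tokens, so the right-hand operand is always $4.
      ["expression + opt_newlines expression", "$$ = $1 + ' + ' + $4;"],
      ["expression - opt_newlines expression", "$$ = $1 + ' - ' + $4;"],
      ["expression * opt_newlines expression", "$$ = $1 + ' * ' + $4;"],
      ["expression / opt_newlines expression", "$$ = $1 + ' / ' + $4;"],
      ["expression % opt_newlines expression", "$$ = $1 + ' % ' + $4;"]
    ],
    // Kleene star: the empty alternative (assumed syntax) or one more NEWLINE.
    opt_newlines: ["", "opt_newlines NEWLINE"],
    // Kleene plus, unchanged from the question, still terminates a statement.
    newlines: [
      ["NEWLINE", ""],
      ["newlines NEWLINE", ""]
    ]
  }
};
```

Since each operator rule still contains exactly one operator token, the precedence declared in "operators" should continue to apply to it, and the ten operator rules from the question shrink to five.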

Licensed under: CC-BY-SA with attribution