Question

I am writing a simple Jison grammar to get some experience before starting a more complex project. I tried a simple grammar for a comma-separated list of numeric ranges, where a range whose beginning and ending values are the same can be written in shorthand as a single number. However, when running the generated parser on some test input, I get an error which does not make a lot of sense to me. Here is the grammar I came up with:

/* description: Parses end executes mathematical expressions. */

/* lexical grammar */
%lex
%%

\s+                   /* skip whitespace */
[0-9]+                {return 'NUMBER'}
"-"                   {return '-'}
","                   {return ','}
<<EOF>>               {return 'EOF'}
.                     {return 'INVALID'}

/lex

/* operator associations and precedence */

%start ranges

%% /* language grammar */

ranges
    : e EOF
        {return $1;}
    ;

e   :  rng { $$ = $1;}
    | e ',' e {alert('e,e');$$ = new Array(); $$.push($1); $$.push($3);}
    ;

rng
    : NUMBER '-' NUMBER
        {$$ = new Array(); var rng = {Start:$1, End: $3; }; $$.push(rng); }
    | NUMBER
        {$$ = new Array(); var rng = {Start:$1, End: $1; }; $$.push(rng);}
    ;

NUMBER: {$$ = Number(yytext);};

The test input is this:

5-10,12-16

The output is:

Parse error on line 1:
5-10,12-16
^
Expecting '-', 'EOF', ',', got '8'

If I put an 'a' at the front, I get an expected error about finding "INVALID", but I don't have an "8" in the input string, so I am wondering if this is an internal state?

I am using the online parser generator at: http://zaach.github.io/jison/try/

Thoughts?


Solution

This production is confusing Jison (and it confused me, too :) ):

NUMBER: {$$ = Number(yytext);};

NUMBER is supposed to be a terminal, but the production above declares it as a non-terminal with an empty body. Because an empty body matches the empty string, it matches immediately, and your grammar doesn't allow two consecutive NUMBERs. Hence the error.
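
A minimal sketch of one possible fix, assuming the numeric conversion was the only reason for adding that production: delete the NUMBER production entirely and convert the token text inside the rng actions instead. In a parser action, $1 and $3 hold the matched text of the NUMBER tokens as strings, so Number(...) can be applied there. The object literals below also drop the semicolon that appeared before the closing brace in the original actions, since that is not valid JavaScript.

```jison
rng
    : NUMBER '-' NUMBER
        {
            // $1 and $3 are the token texts (strings); convert them here
            $$ = [{ Start: Number($1), End: Number($3) }];
        }
    | NUMBER
        {
            // single-number shorthand: start and end are the same value
            $$ = [{ Start: Number($1), End: Number($1) }];
        }
    ;
```

This keeps the one-element array shape that the original rng actions produced, so the rest of the grammar does not have to change for this fix alone.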

Also, your grammar is ambiguous, although I suppose Jison's default will solve the issue. It would be better to be explicit, though, since it's easy. Your rule:

e   : rng 
    | e ',' e

does not specify how ',' "associates": in other words, whether rng , rng , rng should be parsed as e , rng or as rng , e. The first one is probably better for you, so you should write it explicitly:

e   :  rng
    |  e ',' rng

One big advantage of the above is that you don't need to create a new array in the second production; you can just push $3 onto the end of $1 and set $$ to $1.
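
The left-recursive rule with its actions might look like the sketch below. It assumes rng is changed to produce a single {Start, End} object rather than a one-element array; that change is an assumption of this example, not something the original grammar does.

```jison
e
    : rng
        {
            // first range: start a new list
            $$ = [$1];
        }
    | e ',' rng
        {
            // later ranges: append to the list built so far and reuse it
            $1.push($3);
            $$ = $1;
        }
    ;
```

If rng keeps returning a one-element array as in the original grammar, the first action can stay as $$ = $1 and the second can use $$ = $1.concat($3) so that the result remains a flat list.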

Licensed under: CC-BY-SA with attribution