Question

I am currently writing a parser with yecc in Erlang.

Nonterminals expression.

Terminals '{' '}'  '+' '*' 'atom' 'app' 'integer' 'if0' 'fun' 'rec'.

Rootsymbol expression.

expression -> '{' '+' expression  expression '}' : {'AddExpression', '$3','$4'}.
expression -> '{' 'if0' expression expression expression '}' : {'if0', '$3', '$4', '$5'}.
expression -> '{' '*' expression expression '}' : {'MultExpression', '$3','$4'}.
expression -> '{' 'app' expression expression '}' : {'AppExpression', '$3','$4'}.
expression -> '{' 'fun' '{' expression '}' expression '}': {'FunExpression', '$4', '$6'}.
expression -> '{' 'rec' '{' expression expression '}' expression '}' : {'RecExpression', '$4', '$5', '$7'}.
expression -> atom : '$1'.
expression -> integer : '$1'.

I also have an Erlang project that tokenizes the input before parsing:

tok(X) ->
    element(2, erl_scan:string(X)).

get_Value(X) ->
    element(2, parse(tok(X))).

These cases are accepted:

interp:get_Value("{+ {+ 4 6} 6}").
interp:get_Value("{+ 4 2}"). 

These return {'AddExpression',{'AddExpression',{integer,1,4},{integer,1,6}},{integer,1,6}} and {'AddExpression',{integer,1,4},{integer,1,2}}, respectively.

But this test case:

interp:get_Value("{if0 3 4 5}").

Returns:

{1,string_parser,["syntax error before: ","if0"]}
Solution

The terminals in your grammar rules name the category of a token, not its value. So you can match against an atom token, but not against one specific atom. If you are using the Erlang tokenizer, the token generated for "if0" will be {atom,Line,if0}, while your grammar expects a {if0,Line} token. This is what the "Pre-processing" section of the yecc documentation is trying to explain.
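To make the mismatch concrete, the failing input can be scanned in the Erlang shell. This is only an illustration; it assumes erl_scan is called with its default options, so that locations are plain line numbers:

```erlang
%% Scan the failing input with the standard Erlang tokenizer.
{ok, Tokens, _EndLocation} = erl_scan:string("{if0 3 4 5}"),

%% "if0" arrives as a generic atom token, not as an {if0, Line} token,
%% so the 'if0' terminal in the grammar never matches it.
[{'{', 1}, {atom, 1, if0},
 {integer, 1, 3}, {integer, 1, 4}, {integer, 1, 5},
 {'}', 1}] = Tokens.
```

By contrast, '+' and '*' are operators that erl_scan already emits as {'+',Line} and {'*',Line}, which is why the addition cases parse.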

You will need a special tokenizer for this. If you want to keep using the Erlang tokenizer, a simple way of handling it is a pre-processing pass that scans the token list and converts {atom,Line,if0} tokens to {if0,Line} tokens.
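A minimal sketch of such a pre-processing pass might look like the following. The module name interp_tokens, the function names preprocess/1 and retag/1, and the KEYWORDS macro are hypothetical, chosen for illustration. The keyword list assumes that app and rec need the same treatment as if0, since erl_scan reports them as plain atoms too; fun is a reserved word in Erlang and should already come out as {'fun',Line}.

```erlang
-module(interp_tokens).   % hypothetical module name
-export([tok/1, preprocess/1]).

%% Keywords of the parsed language that erl_scan reports as plain atoms.
%% Assumption: these are the only terminals that need retagging.
-define(KEYWORDS, [if0, app, rec]).

%% Tokenize with the standard Erlang scanner, then retag the keywords.
tok(String) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(String),
    preprocess(Tokens).

%% Walk the token list and rewrite keyword atoms one by one.
preprocess(Tokens) ->
    lists:map(fun retag/1, Tokens).

%% {atom, Line, if0} becomes {if0, Line}; every other token is kept as-is,
%% so ordinary atoms still match the 'atom' terminal.
retag({atom, Line, Name} = Token) ->
    case lists:member(Name, ?KEYWORDS) of
        true  -> {Name, Line};
        false -> Token
    end;
retag(Token) ->
    Token.
```

With this in place, interp_tokens:tok("{if0 3 4 5}") yields a list whose second token is {if0,1}, and that list can be passed to the yecc-generated parse/1 in place of the raw erl_scan output.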

Licensed under: CC-BY-SA with attribution