Question

I am writing a grammar for a programming language, but I'm running headfirst into a shift/reduce problem. The problem can be found in the state:

fn_call -> ID . L_PAREN fn_args R_PAREN
assignment -> ID . ASSIGN value
assignment -> ID . ASSIGN container
value -> ID

Before explaining a bit further, I want to clarify:

Is this shift/reduce because the program can't determine if I am calling a function or using the ID as a value (eg. constant or variable)?

Moving on, is it possible to fix this? My language does not currently use line delimiters (such as ';' in C or '\n' in Python). The parser is LALR(1).

What is the most efficient (adding the fewest rules to the grammar) way to decipher between a function call or a variable with line delimiters?

EDIT: Here is the lookahead for that state.

  ! shift/reduce conflict for L_PAREN resolved as shift
    L_PAREN         shift and go to state 60
    ASSIGN          shift and go to state 61
    COMMA           reduce using rule 43 (value -> ID .)
    R_PAREN         reduce using rule 43 (value -> ID .)
    DASH            reduce using rule 43 (value -> ID .)
    R_BRACE         reduce using rule 43 (value -> ID .)
    NONE            reduce using rule 43 (value -> ID .)
    DEFN            reduce using rule 43 (value -> ID .)
    FOR             reduce using rule 43 (value -> ID .)
    INT_T           reduce using rule 43 (value -> ID .)
    DBL_T           reduce using rule 43 (value -> ID .)
    STR_T           reduce using rule 43 (value -> ID .)
    ID              reduce using rule 43 (value -> ID .)
    INT             reduce using rule 43 (value -> ID .)
    DBL             reduce using rule 43 (value -> ID .)
    STR             reduce using rule 43 (value -> ID .)
    COMMENT_LINE    reduce using rule 43 (value -> ID .)
    L_BRACE         reduce using rule 43 (value -> ID .)
    SET             reduce using rule 43 (value -> ID .)

  ! L_PAREN         [ reduce using rule 43 (value -> ID .) ]
Was it helpful?

Solution 2

If this is the set of items that form a "state" of the parser, then you haven't written it down right:

fn_call -> ID . L_PAREN fn_args R_PAREN
assignment -> ID . ASSIGN value
assignment -> ID . ASSIGN container
value -> ID .  *missing lookahead set*

You don't exhibit the rest of your language, so we cannot know what the lookahead set is for the rule

 value -> ID

Under the assumption that you indeed have a shift-reduce conflict in this state, then the lookahead set must contain "ASSIGN" or "L_PAREN". I can't tell you how to fix your problem without knowing more.

Given that your present grammar has these issues, you cannot fix this simply "adding rules" of any kind, whether they involve line delimiters or not, because adding rules will not change what is already in lookahead sets (it may add more tokens to existing sets).

EDIT: One way out of your problem may be to switch parsing technologies. Your problem is the LALR parsers cannot handle the local ambiguity that you seem to have. However, your overall grammar may not have an actual ambiguity if you look further ahead. That depends on your language syntax but you are rolling you own so you can do as you please. I suggest looking into GLR parsing technology, which can handle arbitrary lookahead; check out recent versions of Bison.

OTHER TIPS

The following is just guesswork since you haven't shown much of your grammar. I'm assuming that you allow expressions as statements, and not just function calls. In that case, an expression can start with an (, and a statement can end with an ID. Since you have no statement delimiters (I think), then the following is truly ambiguous:

a = b
(c + d)

After reading the b (ID), it is unclear whether to reduce it to value, as part of the assignment, or to leave it as an ID and shift the ( as part of fn_call.

You can't remove ambiguity by adding productions. :)

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top