Question

The following grammar suffers from the dangling else problem, even though I've tried to solve it after reading http://marvin.cs.uidaho.edu/~heckendo/CS445/danglingElse.html I'm wondering if you can spot what I've done wrong ...

%{
%}

%token PROGRAM CONST TYPE ARRAY LIST SET OF RECORD VAR FUNCTION PROCEDURE
%token INTEGER REAL BOOLEAN CHAR FORWARD LENGTH NEW T_BEGIN END IF THEN ELSE
%token WHILE DO CASE OTHERWISE FOR TO DOWNTO WITH READ WRITE
%token LISTFUNC SEMI 
%token CCONST BCONST STRING RCONST ICONST ID
%token RBRACK RPAREN COMMA  ASSIGN DOTDOT COLON
%token LBRACK INOP RELOP EQU ADDOP OROP MULDIVANDOP NOTOP DOT LPAREN

%nonassoc LBRACK 
%nonassoc INOP RELOP EQU 
%left ADDOP OROP
%left MULDIVANDOP
%nonassoc NOTOP
%left DOT LPAREN 

%%

program : header declarations subprograms comp_statement DOT 
;

header : PROGRAM ID SEMI
;

declarations : constdefs typedefs vardefs 
;

constdefs : CONST constant_defs SEMI
|
;
constant_defs : constant_defs SEMI ID EQU expression 
| ID EQU expression 
;

expression : expression RELOP expression 
| expression EQU expression  
| expression INOP expression  
| expression OROP expression  
| expression ADDOP expression  
| expression MULDIVANDOP expression  
| ADDOP expression 
| NOTOP expression
| variable 
| ID LPAREN expressions RPAREN 
| LENGTH LPAREN expression RPAREN
| NEW LPAREN expression RPAREN 
| constant 
| LPAREN expression RPAREN
| setlistexpression 
;

variable : ID 
| variable DOT ID 
| variable LBRACK expressions RBRACK 
| LISTFUNC LPAREN expression RPAREN 
;

expressions : expressions COMMA expression 
| expression
;

constant : ICONST 
| RCONST
| BCONST 
| CCONST 
;

setlistexpression : LBRACK expressions RBRACK
| LBRACK RBRACK
;

typedefs : TYPE type_defs SEMI 
| 
;

type_defs : type_defs SEMI ID EQU type_def 
| ID EQU type_def 
;

type_def : ARRAY LBRACK dims RBRACK OF typename 
| LIST OF typename 
| SET OF typename 
| RECORD fields END 
| limit DOTDOT limit 
;

dims : dims COMMA limits 
| limits 
;

limits : limit DOTDOT limit
| ID
;

limit : sign ICONST 
| CCONST 
| BCONST 
| ADDOP ID 
|
ID
;

sign : ADDOP
| 
;

typename : standard_type
| ID 
;

standard_type : INTEGER 
| REAL
| BOOLEAN
| CHAR 
;

fields : fields SEMI field 
| field 
;

field : identifiers COLON typename 
;

identifiers : identifiers COMMA ID 
| ID 
;

vardefs : VAR variable_defs SEMI 
| 
;

variable_defs : variable_defs SEMI identifiers COLON typename
| identifiers COLON typename
;

subprograms : subprograms subprogram SEMI 
| 
;

subprogram : sub_header  SEMI FORWARD
| sub_header SEMI declarations subprograms comp_statement
;

sub_header : FUNCTION ID formal_parameters COLON standard_type 
| FUNCTION ID formal_parameters COLON LIST 
| PROCEDURE ID formal_parameters 
| FUNCTION ID  
;

formal_parameters : LPAREN parameter_list RPAREN
| 
;

parameter_list : parameter_list SEMI pass identifiers COLON typename 
| pass identifiers COLON typename 
;

pass : VAR 
| 
;

comp_statement : T_BEGIN statements END 
;


assignment : variable ASSIGN expression 
| variable ASSIGN STRING 
;

case_statement : CASE expression OF cases case_tail END 
;

cases : cases SEMI single_case 
| single_case 
;

single_case : label_list COLON statement 
| 
;

label_list : label_list COMMA label 
| label 
;

label : sign constant 
| sign ID 
;


case_tail : SEMI OTHERWISE COLON statement 
| 
;

while_statement : WHILE expression DO statement 
;

for_statement : FOR ID ASSIGN iter_space DO statement 
;

iter_space : expression TO expression
| expression DOWNTO expression 
;

with_statement : WITH variable DO statement 
;

subprogram_call : ID
| ID LPAREN expressions RPAREN 
;

io_statement : READ LPAREN read_list RPAREN 
| WRITE LPAREN write_list RPAREN
;

read_list : read_list COMMA read_item 
| read_item 
;

read_item : variable 
;

write_list : write_list COMMA write_item 
| write_item 
;

write_item : expression 
| STRING
;
statements : statements SEMI statement 
| statement 
;

statement : matched
| unmatched
;

matched: assignment 
| matched_if_statement
| case_statement 
| while_statement 
| for_statement 
| with_statement 
| subprogram_call 
| io_statement 
| comp_statement
|
;


matched_if_statement: IF expression THEN matched ELSE matched
;

unmatched: IF expression THEN statement
| IF expression THEN matched ELSE unmatched
;

%%

I've put the if statement code at the bottom of the script.

Thank you for your time.

Was it helpful?

Solution

The problem is that there are a number of ways that a matched can expand to something that ends with a statement (and thus an unmatched). In particular, While_statement, for_statement and with_statement all end with a statement with no kind of terminator. So that statement might be an unmatched (if) and thus if it appears after a THEN, it triggers the ambiguity. For example:

IF expression THEN WHILE expression DO IF expression THEN statement ELSE statement

is ambiguous as the ELSE can bind to either IF. To fix it, you need to split all of the statements that might end in a statement into matched and unmatched versions. So you'll end up with

statement : matched
| unmatched
;

matched: assignment 
| matched_if_statement
| case_statement 
| matched_while_statement 
| matched_for_statement 
| matched_with_statement 
| subprogram_call  
| io_statement 
| comp_statement
|
;

unmatched: unmatched_if_statement 
| unmatched_while_statement 
| unmatched_for_statement 
| unmatched_with_statement 
;

matched_while_statement : WHILE expression DO matched
;

unmatched_while_statement : WHILE expression DO unmatched
;

etc

The unmatched rules all expand to some complex statement that ends with an unmatched if with no corresponding else, which cannot appear between a THEN and ELSE.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top