Question

Let's say I have this grammar:

A: ε
 | B 'a'
B: ε
 | B 'b'

What is considered to be the closure of the item A: • B 'a'?
In other words, how do I deal with the epsilon transitions when figuring out closures?

Was it helpful?

Solution

This is pretty straightforward. Included in the closure of

    A = ... <dot> X ... ;

are all the rules

    X =   <dot> R1 R2 R3 ... ;

where first(R1) is not empty. For each (nonempty) token K in first(R1), you'll need to (transitively!) include

    R1 = <dot> k ... ;

etc. but presumably you are already clear on this.

You specific question is what happens if R1 can be empty? Then you also need to include

    X =   R1 <dot> R2 ... ;

Similarly for R2 being empty, if R1 can be empty, and similarly for Ri being empty if R1 .. Ri-1 can be empty. In extreme circumstances, all the Ri can be empty (lots of optional subclauses in your grammar), and you can end up including

    X =  R1 R2 ... Rn <dot> ;

Note that determining that first(R1) "can be empty" is itself a transitive closure question.

The GLR parser generator that I built for DMS precomputes first_can_be_empty using Warshall's algorithm and then uses that in the closure construction.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top