Question

I'm currently having a look at GNU Bison to parse program code (or actually to extend a program that uses Bison for doing that). I understand that Bison can only (or: best) handle LR(1) grammars, i.e. a special form of context-free grammars; and I actually also (believe to) understand the rules of context-free and LR(1) grammars.

However, somehow I'm lacking a good understanding of the notion of a LR(1) grammar. Assume SQL, for instance. SQL incorporates - I believe - a context-free grammar. But is it also a LR(1) grammar? How could I tell? And if yes, what would violate the LR(1) rules?

Was it helpful?

Solution

LR(1) means that you can choose proper rule to reduce by knowing all tokens that will be reduced plus one token after them. There are no problems with AND in boolean queries and in BETWEEN operation. The following grammar, for example is LL(1), and thus is LR(1) too:

expr ::= and_expr | between_expr | variable
and_expr ::= expr "and" expr
between_expr ::= "between" expr "and" expr
variable ::= x

I believe that the whole SQL grammar is even simpler than LR(1). Probably LR(0) or even LL(n).

OTHER TIPS

Some of my customers created SQL and DB2 parsers using my LALR(1) parser generator and used them successfully for many years. The grammars they sent me are LALR(1) (except for the shift-reduce conflicts which are resolved the way you would want). For the purists -- not LALR(1), but work fine in practice, no GLR or LR(1) needed. You don't even need the more powerful LR(1), AFAIK.

I think the best way to figure this out is to find an SQL grammar and a good LALR/LR(1) parser generator and see if you get a conflict report. As I remember an SQL grammar (a little out of date) that is LALR(1), is available in this download: http://lrstar.tech/downloads.html

LRSTAR is an LR(1) parser generator that will give you a conflict report. It's also LR(*) if you cannot resolve the conflicts.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top