Question

I am desperately looking for the Java grammar which can guarantees to be LR(1). I don't care which version of Java; I should be able to modify it to the version I want.

I'm reading the last chapter of Java™ Language Specification, Second Edition. And hell, the grammar in that chapter seems not be LR(1) at all. I have a feeling that the grammar is cyclic or somewhat weird, as I keep getting reduce-shift conflicts when I'm trying to generate the parse table from the grammar.

Can anyone gives some suggestions on this?

Note: Edited the title to L(AL)R since I'm okay to accept LALR grammar for Java as well.

Was it helpful?

Solution

Chapter 19 of the Java Language Specification, version 1.0 has an LALR(1) grammar for Java. This document can be found in many places by searching the web for "java language specification 1.0". This chapter disappeared in the second edition of the JLS, replaced by a grammar that (as you already point out) is not LR(1). Here are a couple of working links1 to the version 1.0 spec:

However, regarding this grammar, it's worth reading this analysis by Trevor Jim, who points out that the presented grammar is ambiguous. It's unclear whether it is possible to introduce disambiguation rules without destroying the context-free nature of the grammar. His conclusion is that whether Java is a context-free language is an open question (although he believes it is).

Note that the above applies to Java 1.0. I don't believe that more recent Java language versions are context-free. As pointed out in this thread, "Neither C# nor Java is context-free, because checking of whether a variable is used correctly and consistently throughout a particular scope is known not to be context-free". For instance, forward references (e.g., to methods) cannot be resolved in a single pass with a PDA.

1At least they were live when I posted this answer.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top