Question

I'm trying to design an island grammar using Rascal MPL, but I ran into a problem:

When implementing an island grammar in SDF, a very common approach is to define a "catch-all" water production using the {avoid} attribute. This prevents the parser from using this production if others are applicable, which makes it possible to specify a default behaviour that other productions can override without generating ambiguities. A very simple example of this would be:

context free syntax
    Chunk*         -> Input
    Water          -> Chunk
lexical syntax
    ~[\t\n\ ]+   -> Water {avoid}  // avoid the Water production

I tried reproducing this behaviour with Rascal MPL. My goal is to create an island grammar that gathers all conditional preprocessor directives inside a piece of C/C++ code and skips the rest of the input using Water productions.

layout LAYOUT = [\t\n\ ];
lexical WATER = ![\t\n\ ]+;

start syntax Program = Line*;       // program consists of lines

syntax Line = ConditionalDirective  // preprocessor directives
            > WATER;                // catch-all option

syntax ConditionalDirective = "#ifdef" 
                            | "#ifndef"
                            | "#if"
                            | "#elif";

I tried to create the {avoid} effect by giving the ConditionalDirective production a higher priority using the ">" operator, but apparently this doesn't work: the parse tree still contains ambiguities.

#ifdef asd

If I parse the above code for example, I get a parse tree that looks as follows:

[image: ambiguous parse tree]

As far as I can tell from the Rascal documentation, the priority operator might not be the way to go in my case, but I don't see any other possibilities. I assume there is a way though, because the authors of Rascal clearly state that every SDF grammar can be converted to a Rascal grammar.

Is there a way to reproduce SDF's {avoid} functionality with Rascal MPL? Or is it possible to filter the parse forest somehow, reapplying the priorities?

Solution

Short answer: in SDF2, avoid is a post-parse filter. In Rascal you can define such filters yourself; see https://github.com/cwi-swat/rascal/blob/master/src/org/rascalmpl/library/lang/sdf2/filters/PreferAvoid.rsc for an example that mimics SDF2's avoid behavior without ignoring injection chains and without counting. You can import it in your grammar and use @avoid and @prefer tags just like in SDF2, or write your own filters.
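
As a rough idea of what such a filter can look like, here is a minimal, untested sketch of an avoid-only filter. The module name AvoidWater and the functions isAvoided and avoidFilter are made up for illustration. The sketch assumes that an @avoid tag on a production shows up in its attribute set as \tag("avoid"()), and that the water alternative in the grammar is tagged along the lines of `syntax Line = ConditionalDirective | @avoid WATER;`. How a filter gets hooked into the parser depends on the Rascal version, so check the ParseTree documentation for that part.

```rascal
module AvoidWater

import ParseTree;

// Assumption: a production tagged with @avoid carries \tag("avoid"())
// in its attribute set.
bool isAvoided(Tree t) = appl(prod(_, _, {\tag("avoid"()), *_}), _) := t;

// Hypothetical filter: drop the avoided alternatives from an ambiguity
// cluster, unless that would remove every alternative.
Tree avoidFilter(amb(set[Tree] alts)) {
  kept = { t | t <- alts, !isAvoided(t) };

  // Nothing to decide: all alternatives are avoided, or none are.
  if (kept == {} || kept == alts) fail;

  // Exactly one alternative left: the ambiguity is resolved.
  if ({Tree single} := kept) return single;

  return amb(kept);
}
```

Like the avoid filter described above, this only compares alternatives within one ambiguity cluster; it does not count anything.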

Caveat: avoid was generally not enough to define water behavior in SDF2, and it isn't in Rascal either. The reason is that water can become longer than its alternative, whereas prefer and avoid can only choose between alternatives that span a subsentence of the same length. One surefire but slow way to deal with water in Rascal is to count it in every alternative and choose the derivations with the least water.
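
The counting approach could be sketched roughly as follows. This is an untested illustration, not library code: the module name CountWater and the functions waterCount and leastWater are hypothetical, and the sketch assumes the water lexical is called WATER as in the question's grammar.

```rascal
module CountWater

import ParseTree;
import Set;

// Count the WATER nodes anywhere inside a tree.
// Assumption: water is defined by a lexical named WATER.
int waterCount(Tree t)
  = ( 0 | it + 1 | /appl(prod(lex("WATER"), _, _), _) := t );

// Hypothetical filter: of all alternatives in an ambiguity cluster,
// keep only those that contain the least water.
Tree leastWater(amb(set[Tree] alts)) {
  best = min({ waterCount(t) | t <- alts });
  kept = { t | t <- alts, waterCount(t) == best };

  // All alternatives contain equally much water: nothing to decide.
  if (kept == alts) fail;

  // Exactly one alternative left: the ambiguity is resolved.
  if ({Tree single} := kept) return single;

  return amb(kept);
}
```

The deep match walks every alternative in full, which is where the slowness comes from on larger inputs.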

Another issue with prefer and avoid was that their uses would start interfering with each other, especially when they were counted. This can be avoided in Rascal by specializing filters for specific nonterminals or even individual alternatives.

Another option is to use the \ and ! disambiguation operators; see the manual. All in all, however, I believe post-parse filtering is currently the best way to deal with island grammars, because you control what is going on.
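
For comparison, a grammar-level variant of the question's example might look like the untested sketch below. It reads "!" as the follow restriction !>>, which is only one of the operators in that family, and it introduces two hypothetical names, DirectiveKeyword and NonSpace. The idea is that \ removes the directive keywords from what WATER may match, so no ambiguity arises in the first place.

```rascal
layout LAYOUT = [\t\n\ ];

// The directives as a finite keyword language that water must not match.
keyword DirectiveKeyword = "#ifdef" | "#ifndef" | "#if" | "#elif";

// A maximal run of non-whitespace: !>> forbids a following non-whitespace
// character, so a run cannot stop halfway through a word.
lexical NonSpace = ![\t\n\ ]+ !>> ![\t\n\ ];

// Water is any such run, except the directive keywords.
lexical WATER = NonSpace \ DirectiveKeyword;

start syntax Program = Line*;       // program consists of lines

syntax Line = ConditionalDirective  // preprocessor directives
            | WATER;                // catch-all option

syntax ConditionalDirective = "#ifdef"
                            | "#ifndef"
                            | "#if"
                            | "#elif";
```

This keeps the decision inside the grammar, at the cost of having to list every island keyword that water could otherwise swallow.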

Licensed under: CC-BY-SA with attribution