Question

I have a simple Rascal file that specifies a toy grammar:

module temp

import IO;

import ParseTree;

layout LAYOUT = [\t-\n\r\ ]*;

start syntax Simple 
  =  A B ;

syntax A = "Hello"+ ("joe" "pok")* ;
syntax A= "Hi";
syntax B = "world"*|"wembly";
syntax B =    C | C C*   ;


public void main() {
  println("hello");
  iprint(parse(#start[Simple], "Hello Hello world world world"));
}

This works fine. However, I didn't want to write

syntax B =    C | C C*   ;

I wanted to write

syntax B =  (  C | C C*  )? ;

but Rascal rejected it with a parse error, even though all of

syntax B =  (  C  C C*  )? ;

syntax B =  (  C |  C*  )? ;

syntax B =    C | C C*   ;

are accepted fine. Can anyone explain to me what I'm doing wrong?


Answer

A nested sequence (the sequence symbol) always requires brackets in Rascal. The meta-notation is defined as:

syntax Sym = sequence: "(" Sym+ ")" | opt: Sym "?" | alternative: "(" Sym "|" {Sym "|"}+ ")" | ... ;

So, in your example you should have written:

syntax B = (C | (C C*))?;
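
To make the difference concrete, here is a small sketch that puts the rejected and the accepted form side by side. Note that the question never shows how C is defined, so the `syntax C = "c";` rule below is only a hypothetical stand-in to keep the fragment complete.

```rascal
// Hypothetical stand-in: the question does not define C.
syntax C = "c";

// Rejected: inside the brackets this is a nested alternative, and its
// second branch, the sequence C C*, has no brackets of its own.
// syntax B = (C | C C*)?;

// Accepted: the nested sequence gets its own brackets.
syntax B = (C | (C C*))?;
```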

What is perhaps confusing is that Rascal uses the | sign twice: once to separate top-level alternatives, and once inside a nested alternative:

syntax X = "a" | "b"; // top-level
syntax Y = ("c" | "d"); // nested, will internally generate a new rule: 
syntax ("c" | "d") = "c" | "d";

Finally, normal alternatives have sequences without brackets, as in:

syntax B 
  = C
  | C C*
  ;
// or less abstractly:
syntax Exp = left Exp "*" Exp
           > left Exp "+" Exp
           ;

BTW, we generally avoid using too many nested regular expressions: they are anonymous, which makes the resulting parse trees harder to interpret. Regular expressions are best used for lexical syntax, where we are not so much interested in the internal structure anyhow.
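
As an illustration of that advice, the optional nested expression could be unfolded into ordinary top-level alternatives with labels. This is only a sketch: the labels `none`, `one` and `many` are names I made up, and `syntax C = "c";` is again a hypothetical stand-in because the question does not define C.

```rascal
// Hypothetical stand-in: the question does not define C.
syntax C = "c";

// Same language as (C | (C C*))?, but every case is a named alternative.
syntax B
  = none: ()     // the "?" becomes an explicit empty alternative
  | one: C
  | many: C C*   // top-level sequence, so no brackets are needed
  ;
```

The labels appear in the productions of the parse tree, which makes it easier to tell which alternative was used than with an anonymous nested expression.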

License: CC-BY-SA with attribution
Not affiliated with StackOverflow