Question

I'm trying to capture quoted strings without the quotes. I have this terminal

%token <string> STRING

and this production

constant:
    | QUOTE STRING QUOTE { String($2) }

along with these lexer rules

| '\''       { QUOTE }
| [^ '\'']*  { STRING (lexeme lexbuf) } //final regex before eof

The lexer seems to treat everything leading up to a QUOTE as a single lexeme, which then fails to parse. So maybe my problem is elsewhere in the grammar; I'm not sure. Am I going about this the right way? It was parsing fine before I tried to exclude quotes from strings.

Update

I think there may be some ambiguity with the following lexer rules

let name = alpha (alpha | digit | '_')*
let identifier = name ('.' name)*

The following rule is prior to STRING

| identifier    { ID (lexeme lexbuf) }

Is there any way to disambiguate these without including quotes in the STRING regex?


Solution

It's pretty normal to do semantic analysis in the lexer for constants like strings and numeric literals, so you might consider a lex rule for your string constants like

| '\'' [^ '\'']* '\'' 
    { STRING (let s = lexeme lexbuf in s.Substring(1, s.Length - 2)) }
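
The quote-stripping step in that action can be tried on its own in plain F#. This is only a sketch: `stripQuotes` is a hypothetical helper name that does not appear in the rule above, and it assumes the lexer has already guaranteed one leading and one trailing quote.

```fsharp
// Hypothetical helper (name assumed for illustration): drops the first and
// last character of a lexeme. It assumes the lexer rule has already matched
// an opening and a closing quote, so s.Length >= 2.
let stripQuotes (s: string) : string =
    s.Substring(1, s.Length - 2)

// Usage: the lexeme as the lexer would see it, quotes included.
printfn "%s" (stripQuotes "'hello world'")
```

Because the whole literal is matched by one rule, the identifier rule never gets a chance to match text inside the quotes.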

Other tips

You can keep the quotes in the lexeme and trim them in the parser.

Lexer:

let constant       = ("'" ([^ '\''])* "'")
...
| constant         { STRING(lexeme lexbuf) } 

Parser:

%token <string> STRING

...
constant:
    | STRING { ($1).Trim([|'\''|]) }

Or, if you want to keep the quotes as separate tokens:

Lexer:

let name = alpha (alpha | digit | '_')*
let identifier = name ('.' name)*
...

| '\''       { QUOTE }
| identifier { ID (lexeme lexbuf) }
| _          { STRING (lexeme lexbuf) } 

The identifier rule will claim some of the characters that would otherwise go to STRING, so the token stream can look like QUOTE ID STRING ID .. QUOTE, and the parser has to reassemble the pieces:

Parser:

constant:
     | QUOTE content QUOTE     { String($2) }

content:
     | ID content      { $1+$2 }
     | STRING content  { $1+$2 }
     | ID              { $1 }
     | STRING          { $1 }
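
To see what the `content` production does with such a token stream, here is a minimal plain-F# sketch that mimics it outside of fsyacc. The `Token` type and the functions `content` and `constant` are hypothetical stand-ins for the grammar above, not generated parser code, and the sample stream assumes that no whitespace-skipping rule comes before the `_` rule in the lexer.

```fsharp
// Hypothetical token type mirroring the QUOTE / ID / STRING terminals.
type Token =
    | QUOTE
    | ID of string
    | STRING of string

// Mirrors the `content` production: concatenates consecutive ID and STRING
// lexemes in order and returns the text plus the tokens that remain.
let rec content (tokens: Token list) : string * Token list =
    match tokens with
    | ID s :: rest | STRING s :: rest ->
        let tail, remaining = content rest
        s + tail, remaining
    | _ -> "", tokens

// Mirrors the `constant` production: QUOTE content QUOTE, with non-empty content.
let constant (tokens: Token list) : string option =
    match tokens with
    | QUOTE :: rest ->
        let text, remaining = content rest
        if text <> "" && List.tryHead remaining = Some QUOTE then Some text else None
    | _ -> None

// Usage: a stream the lexer might produce for 'foo.bar baz'.
printfn "%A" (constant [ QUOTE; ID "foo.bar"; STRING " "; ID "baz"; QUOTE ])
```

The pieces are joined purely by concatenation, so any character the lexer discards between the quotes (for example by an earlier whitespace rule) would not be recoverable at this stage.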

I had a similar problem. I capture the strings in the "lexic.l" file using states. Here is my self-answer.

License: CC-BY-SA with attribution
Not affiliated with StackOverflow