Question

I'm currently studying the JSON grammar from the ANTLR project wiki: http://www.antlr.org/wiki/display/ANTLR3/JSON+Interpreter

String  :
    '"' ( EscapeSequence | ~('\u0000'..'\u001f' | '\\' | '\"' ) )* '"'
    ;

fragment EscapeSequence
        :   '\\' (UnicodeEscape |'b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')
        ;

What I don't understand is why they exclude '\\' and '\"' in the String rule. Those characters would be matched by an EscapeSequence anyway.

If we changed the negated set to ~('\u0000'..'\u001f'), it should mean the same thing.

What am I missing?

Solution

The exclusion disallows lone unescaped backslashes and lone unescaped double quotes inside the string body. They are written as '\\' and '\"' because the backslash, at least, must itself be escaped inside a grammar literal.

The EscapeSequence rule, in contrast, allows escaped backslashes and double quotes.

Without the double-quote exclusion, a String token would extend to the last quote that can be found, whereas it should end at the first unescaped quote.

Without the backslash exclusion, the rule would accept sequences that begin with a backslash but are not supported EscapeSequences.
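
Both effects can be reproduced outside ANTLR with a rough sketch in Python. The regular expressions below only approximate the String rule; they are not the ANTLR lexer itself, and a backtracking regex engine is not guaranteed to behave like ANTLR in every case. The names ESCAPE, STRICT, LOOSE and first_token are made up for this illustration, and UnicodeEscape is assumed to be 'u' followed by four hex digits, since its definition is not part of the excerpt in the question.

```python
import re

# EscapeSequence: a backslash followed by one of the supported escape characters.
# Assumption: UnicodeEscape is 'u' plus four hex digits (not shown in the question).
ESCAPE = r'\\(?:u[0-9a-fA-F]{4}|[btnfr"\'\\])'

# Approximation of the original rule: control characters, backslash and
# double quote are all excluded from the plain-character alternative.
STRICT = re.compile(r'"(?:' + ESCAPE + r'|[^\x00-\x1f\\"])*"')

# The variant proposed in the question: only control characters are excluded.
LOOSE = re.compile(r'"(?:' + ESCAPE + r'|[^\x00-\x1f])*"')


def first_token(pattern, text):
    """Return the string token matched at the start of text, or None."""
    m = pattern.match(text)
    return m.group(0) if m else None


two_strings = '"a" : "b"'   # two separate string tokens
bad_escape = r'"\q"'        # \q is not a supported escape sequence

print(first_token(STRICT, two_strings))  # "a"
print(first_token(LOOSE, two_strings))   # "a" : "b"
print(first_token(STRICT, bad_escape))   # None
print(first_token(LOOSE, bad_escape))    # "\q"
```

With the loose pattern the first token swallows everything up to the last quote, and the unsupported \q sequence is accepted because the backslash is consumed as an ordinary character.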

License: CC-BY-SA with attribution
Not affiliated with StackOverflow