I have the following ANTLR grammar:

grammar MyGrammar;

doc :   intro planet;
intro   :   'hi';
planet  :   'world';
MLCOMMENT 
    :   '/*' ( options {greedy=false;} : . )* '*/' { $channel = HIDDEN; };
WHITESPACE : ( 
    (' ' | '\t' | '\f')+
  |
    // handle newlines
    ( '\r\n'  // DOS/Windows
      | '\r'    // Macintosh
      | '\n'    // Unix
    )
    )
 { $channel = HIDDEN; };

In the ANTLRWorks 1.2.3 interpreter, the inputs hi world, hi/**/world and hi /*A*/ world work as expected.

However, the input hiworld, which should not work, is also accepted. How can I make hiworld fail? How can I force at least one whitespace (or comment) between "hi" and "world"?

Note that this example uses only MLCOMMENT and WHITESPACE to keep things simple, but other types of comments will be supported as well.

Any help?

Solution

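You need to add a general ID token. The lexer builds the longest token it can, so with such a rule in place it sees the input "hiworld" as a single ID token, because that match is longer than "hi" or "world" alone. The parser then rejects the input, since doc expects 'hi' followed by 'world', not an ID. Such a rule might look like: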

ID : ('a'..'z' | 'A'..'Z')+;

This is exactly how lexers for programming languages separate the "do" keyword from "double" (a type keyword that starts with "do") or "done" (a variable name).
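
For reference, here is a minimal sketch of the question's grammar with the ID rule added. It assumes the same ANTLR 3 syntax the question uses and has not been run in ANTLRWorks 1.2.3, so treat it as an illustration rather than a verified fix. The letters-only definition of ID is taken from the rule above; a real grammar would probably allow digits and underscores too.

```antlr
grammar MyGrammar;

doc     :   intro planet;
intro   :   'hi';
planet  :   'world';

// Catch-all word token (letters only, as suggested in the answer).
// "hiworld" now lexes as one ID token, which doc does not accept.
ID      :   ('a'..'z' | 'A'..'Z')+;

// Unchanged in behavior from the question: comments and whitespace
// are sent to the hidden channel, so the parser never sees them.
MLCOMMENT
    :   '/*' ( options {greedy=false;} : . )* '*/' { $channel = HIDDEN; };
WHITESPACE
    :   ( (' ' | '\t' | '\f')+ | '\r\n' | '\r' | '\n' ) { $channel = HIDDEN; };
```

With this version, hi world, hi/**/world and hi /*A*/ world should still parse, while hiworld should produce a mismatched-token error because the parser receives a single ID where it expects 'hi'.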

Other tips

One way to make the string hiworld fail is to add a rule with a validating semantic predicate that is guaranteed to fail, as follows. Note that this only targets the exact string hiworld.

doc:      intro planet;
failure : 'hiworld' { false }?;
intro   : 'hi';
planet  : 'world';
// rest of grammar omitted