Question

I've got a string like:

create Person +fname : String, +lname: String, -age:int;

Is it possible to split it with a regex or an EBNF grammar? That is, can every part matched by something like [a-zA-Z0-9] (the parts we don't know in advance) be stored in an array?

In other words, by using this regexp:

^create [a-zA-Z][a-zA-Z0-9]* [s|b]?[+|[-]|=][a-zA-Z][a-zA-Z0-9]*[ ]?:[ ]?[a-zA-Z][a-zA-Z0-9]*(, [s|b]?[+|[-]|=][a-zA-Z][a-zA-Z0-9]*[ ]?:[ ]?[a-zA-Z][a-zA-Z0-9]*)*;

I want to obtain this array:

  • Person
  • +
  • fname
  • String
  • +
  • lname
  • String
  • -
  • age
  • int

Solution

You can try splitting it this way:

// Split on runs of whitespace, ':', ';' or ',', OR at the empty position right after '+' or '-'
// (add '=' to the lookbehind class if '=' markers can occur).
String[] tokens = "create Person +fname : String, +lname: String, -age:int;"
        .split("[\\s:;,]+|(?<=[+\\-])");
for (String s : tokens)
    System.out.println("=> " + s);

output:

=> create
=> Person
=> +
=> fname
=> String
=> +
=> lname
=> String
=> -
=> age
=> int

As you can see, create ends up as the first element of the array, so just start iterating from tokens[1].

You could try to add ^create\\s to the splitting rule, but that would produce an empty string at the start of the tokens array instead, so it wouldn't solve anything.
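If skipping tokens[0] feels fragile, one alternative is to strip the keyword first and then collect matches instead of splitting. The following is only a sketch: the class name CreateStatementTokens and the method tokens are made up for illustration, and the token pattern assumes identifiers of the form [a-zA-Z][a-zA-Z0-9]* and the markers +, - and = from the question's regexp.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CreateStatementTokens {
    // A token is either a single marker (+, - or =) or an identifier.
    // Assumption: this mirrors the character classes in the question's regexp.
    private static final Pattern TOKEN =
            Pattern.compile("[+\\-=]|[a-zA-Z][a-zA-Z0-9]*");

    // Hypothetical helper: returns the tokens that follow the "create " keyword.
    static List<String> tokens(String statement) {
        String prefix = "create ";
        if (!statement.startsWith(prefix)) {
            throw new IllegalArgumentException("Statement must start with \"create \"");
        }
        List<String> result = new ArrayList<>();
        // Scan only the part after the keyword, so "create" never enters the list.
        Matcher m = TOKEN.matcher(statement.substring(prefix.length()));
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [Person, +, fname, String, +, lname, String, -, age, int]
        System.out.println(tokens("create Person +fname : String, +lname: String, -age:int;"));
    }
}
```

Note that this, like the split version, does not validate the statement: separators such as ':' and ',' are simply ignored wherever they appear.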

Other tips

Regex is fine for lots of things, but sometimes you need a real lexer. JFlex is great. There's no tokenization task it can't handle. If you need to go a little further and create a parse tree, JavaCC or ANTLR are good choices.
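To give a rough idea of what a lexer adds over a plain split, here is a hand-written sketch using only java.util.regex. It is not JFlex and does not reflect JFlex's specification format; the class TinyLexer, the Type enum and the three token categories are assumptions made for this illustration. Unlike split, it labels each token and rejects characters it does not recognize.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TinyLexer {
    // Hypothetical token categories chosen for this illustration.
    enum Type { IDENT, VISIBILITY, PUNCT }

    static final class Token {
        final Type type;
        final String text;
        Token(Type type, String text) { this.type = type; this.text = text; }
        @Override public String toString() { return type + "(" + text + ")"; }
    }

    // One named group per category; leading whitespace is matched and skipped.
    private static final Pattern LEXEME = Pattern.compile(
            "\\s+|(?<ident>[a-zA-Z][a-zA-Z0-9]*)|(?<vis>[+\\-=])|(?<punct>[:,;])");

    static List<Token> lex(String input) {
        List<Token> tokens = new ArrayList<>();
        Matcher m = LEXEME.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            // Anchor the next match attempt exactly at the current position.
            m.region(pos, input.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException(
                        "Unexpected character at index " + pos + ": '" + input.charAt(pos) + "'");
            }
            if (m.group("ident") != null) {
                tokens.add(new Token(Type.IDENT, m.group("ident")));
            } else if (m.group("vis") != null) {
                tokens.add(new Token(Type.VISIBILITY, m.group("vis")));
            } else if (m.group("punct") != null) {
                tokens.add(new Token(Type.PUNCT, m.group("punct")));
            } // otherwise the match was whitespace, which is dropped
            pos = m.end();
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (Token t : lex("create Person +fname : String, +lname: String, -age:int;")) {
            System.out.println(t);
        }
    }
}
```

In this sketch the keyword create comes out as an ordinary IDENT token; telling keywords apart from names, and checking that the tokens arrive in a valid order, is the job of the parsing step that tools like JavaCC or ANTLR generate.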

License: CC-BY-SA with attribution
Not affiliated with StackOverflow