Question

I want to split an input string into blocks that start with \begin{<word>} and end with \end{<word>}, where <word> can be "block", "vers" or "refr", and call addBlock() on each block. When I try this method on a string containing two of these blocks, m.groupCount() returns 2, which looked correct to me, but m.find() returns false. How can this be? Calling m.group() throws an exception.

private void addBlocks(String in) {
    Pattern p = Pattern.compile("\\\\begin\\{(vers|refr|block)\\}.*\\\\end\\{(vers|refr|block)\\}");
    Matcher m = p.matcher(in);
    while (m.find()) {
        addBlock(m.group());
    }
}

Edit: Yep, there were several things wrong there. Regex is a pain in the ass, it isn't very intuitive, and there is not that much sensible help online. Here is the code that finally worked:

private void addBlocks(String in) {
    Pattern p = Pattern.compile("\\\\begin\\{(block|vers|refr)\\}(.|$)*?\\\\end\\{(block|vers|refr)\\}", Pattern.DOTALL);
    Matcher m = p.matcher(in);
    while (m.find()) {
        addBlock(m.group());
    }
}
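
The original input string is not shown, so the following is only a guess at why m.find() returned false: without Pattern.DOTALL, the dot does not match line terminators, so a block whose body spans several lines never matches. A minimal sketch, in which the class name DotallDemo, the helper countMatches and the sample input are all made up for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DotallDemo {
    // Same expression as in the question, with a reluctant quantifier.
    static final String REGEX =
            "\\\\begin\\{(vers|refr|block)\\}.*?\\\\end\\{(vers|refr|block)\\}";

    // Hypothetical helper: counts the matches found with the given flags.
    static int countMatches(String input, int flags) {
        Matcher m = Pattern.compile(REGEX, flags).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Assumed input: one block whose body contains a line break.
        String multiLine = "\\begin{vers}line one\nline two\\end{vers}";
        System.out.println(countMatches(multiLine, 0));              // 0
        System.out.println(countMatches(multiLine, Pattern.DOTALL)); // 1
    }
}
```

With Pattern.DOTALL set, the (.|$) alternation in the edited code should not be needed, since a plain .*? already crosses line breaks.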

Solution

In general, your code works for me, at least for this test call:

addBlocks("foo bar \\begin{vers}bla\\end{vers}foo bar baz \\begin{refr}bla2\\end{refr} bla");

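However, your regular expression will call addBlock() at most once per line because of the greedy * quantifier: .* runs from the first \begin{...} to the last \end{...} it can reach and swallows every block in between. You probably want the reluctant *? quantifier instead: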

Pattern p = Pattern.compile("\\\\begin\\{(vers|refr|block)\\}.*?\\\\end\\{(vers|refr|block)\\}");

With the *? quantifier you’ll get two matches for the above test call.
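
To make the difference concrete, here is a small sketch that runs both variants against the test string from above. The class name QuantifierDemo and the helper findAll are hypothetical names used only for this illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuantifierDemo {
    static final String TAGS = "(vers|refr|block)";
    // Greedy: .* grabs as much as possible before the final \end{...}.
    static final String GREEDY =
            "\\\\begin\\{" + TAGS + "\\}.*\\\\end\\{" + TAGS + "\\}";
    // Reluctant: .*? stops at the first \end{...} it can reach.
    static final String RELUCTANT =
            "\\\\begin\\{" + TAGS + "\\}.*?\\\\end\\{" + TAGS + "\\}";

    // Hypothetical helper: collects every match of regex in input.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String in = "foo bar \\begin{vers}bla\\end{vers}foo bar baz "
                + "\\begin{refr}bla2\\end{refr} bla";
        // One match spanning both blocks and the text between them.
        System.out.println(findAll(GREEDY, in).size());    // 1
        // One match per block.
        System.out.println(findAll(RELUCTANT, in).size()); // 2
    }
}
```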

If there is no match on some input, then m.find() will correctly return false and m.group() will not be called (and thus won’t throw any IllegalStateException). Independent of the input string, m.groupCount() will always be 2 for your particular regular expression, as there are 2 capturing groups in the pattern.
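
The behaviour described above can be checked with a short sketch. The class name GroupCountDemo, its helper methods and the input "no blocks here" are assumptions chosen for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupCountDemo {
    static final Pattern P = Pattern.compile(
            "\\\\begin\\{(vers|refr|block)\\}.*?\\\\end\\{(vers|refr|block)\\}");

    // groupCount() reports the capturing groups in the pattern, not the matches.
    static int groupCount(String input) {
        return P.matcher(input).groupCount();
    }

    // Returns true if group() throws after find() has failed.
    static boolean groupThrowsAfterFailedFind(String input) {
        Matcher m = P.matcher(input);
        if (m.find()) {
            return false; // there was a match, so group() is safe to call
        }
        try {
            m.group();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String noBlocks = "no blocks here";
        System.out.println(groupCount(noBlocks));                 // 2
        System.out.println(P.matcher(noBlocks).find());           // false
        System.out.println(groupThrowsAfterFailedFind(noBlocks)); // true
    }
}
```

To count matches rather than groups, increment a counter inside the while (m.find()) loop.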

Other tips

This will never give more than one result, because the greedy .* consumes every character up to the last closing tag.

groupCount() doesn't return the number of matches, but the number of capturing groups. Also explained here: https://stackoverflow.com/a/2989061/2947592

License: CC-BY-SA with attribution
Not affiliated with StackOverflow