Question

This regular expression (the extra \ for each \ is because it is in Java code):

\\*\\{[\\w\\W]*\\}\\*

The actual regex would be

\*\{[\w\W]*\}\*

but this expression is not greedy on the }* side. I am trying to match everything between the { and the }, so if I have

*{ some comment }* hi there and some stuff *{ comment 2 }* and soe more stuff

I should end up with

hi there and some stuff and soe more stuff

but instead it is not greedy enough. There is info on greedy quantifiers here, and I thought I wanted X1, which would be

\\*\\{[\\w\\W]*\\}1\\* or \\*\\{[\\w\\W]*\\}{1}\\*

but that doesn't work. How do I use their X{n} quantifier to force greedy matching in this example?


Solution

Use replaceAll with your regex, but add ? after the * so that [\w\W]* is not greedy (it becomes a reluctant quantifier), like this:

String yourString = "*{ some comment }* hi there and some stuff *{ comment 2 }* and soe more stuff";
// Strings are immutable, so assign the value that replaceAll returns.
yourString = yourString.replaceAll("\\*\\{[\\w\\W]*?\\}\\*", "");

Then you will get: hi there and some stuff and soe more stuff (with the whitespace that surrounded the removed comments left in place).
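To see the difference side by side, here is a minimal sketch that runs the question's original pattern and the reluctant variant on the same input. The class name GreedyVsReluctant and the strip helper are illustrative and not part of the original post; the brackets in the printed output only make the leftover whitespace visible.

```java
final class GreedyVsReluctant {
    // Original pattern: [\w\W]* grabs as much as it can, so the match
    // runs from the first "*{" to the LAST "}*" in the input.
    static final String GREEDY = "\\*\\{[\\w\\W]*\\}\\*";

    // With "?" after "*", the quantifier is reluctant: the match stops
    // at the FIRST "}*" that follows each "*{".
    static final String RELUCTANT = "\\*\\{[\\w\\W]*?\\}\\*";

    // Illustrative helper: remove every match of regex from input.
    static String strip(String input, String regex) {
        return input.replaceAll(regex, "");
    }

    public static void main(String[] args) {
        String s = "*{ some comment }* hi there and some stuff *{ comment 2 }* and soe more stuff";

        // prints "[ and soe more stuff]"
        System.out.println("[" + strip(s, GREEDY) + "]");

        // prints "[ hi there and some stuff  and soe more stuff]"
        System.out.println("[" + strip(s, RELUCTANT) + "]");
    }
}
```

The original pattern is therefore too greedy rather than not greedy enough. X{n} only fixes the number of repetitions of X and has no effect on greediness.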

OTHER TIPS

Try something like this:

\*\{((?!\}\*).)*\}\*

Or in Java form:

\\*\\{((?!\\}\\*).)*\\}\\*

It uses a negative lookahead to distinguish the }* closing tag from } alone. That's the ((?!\}\*).)* part.
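As a rough sketch of how that plays out, the snippet below applies the lookahead pattern to an input whose first comment contains a lone }. The class name TemperedDotDemo, the strip helper, and the sample string are made up for illustration.

```java
import java.util.regex.Pattern;

final class TemperedDotDemo {
    // ((?!\}\*).)* consumes one character at a time, but only while the
    // next two characters are not "}*". A lone "}" is consumed like any
    // other character; only the full "}*" ends the comment body.
    static final Pattern COMMENT = Pattern.compile("\\*\\{((?!\\}\\*).)*\\}\\*");

    // Illustrative helper: remove every comment from input.
    static String strip(String input) {
        return COMMENT.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        // The first comment contains a "}" that is not followed by "*".
        String s = "a *{ x } y }* b *{ z }* c";

        // prints "[a  b  c]"
        System.out.println("[" + strip(s) + "]");
    }
}
```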

Edit: Here's a (Java) version that allows newlines. Alternatively, compile the patterns above with Pattern.DOTALL so that . also matches newlines, and they will work as they are.

\\*\\{((?!\\}\\*)[\\s\\S])*\\}\\*
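The following sketch compares the options on a comment that spans two lines: the . form without flags, the same form compiled with Pattern.DOTALL, and the [\s\S] form. The class name MultilineCommentDemo, the strip helper, and the sample input are assumptions made for illustration.

```java
import java.util.regex.Pattern;

final class MultilineCommentDemo {
    static final String DOT_FORM = "\\*\\{((?!\\}\\*).)*\\}\\*";
    static final String CLASS_FORM = "\\*\\{((?!\\}\\*)[\\s\\S])*\\}\\*";

    // Illustrative helper: remove every match of p from input.
    static String strip(Pattern p, String input) {
        return p.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        String s = "before *{ line one\nline two }* after";

        // Without DOTALL, "." does not match "\n", so the comment is not
        // matched and the input comes back unchanged.
        System.out.println(strip(Pattern.compile(DOT_FORM), s).equals(s)); // prints "true"

        // With DOTALL, "." matches line terminators too.
        // prints "[before  after]"
        System.out.println("[" + strip(Pattern.compile(DOT_FORM, Pattern.DOTALL), s) + "]");

        // [\s\S] matches any character regardless of flags.
        // prints "[before  after]"
        System.out.println("[" + strip(Pattern.compile(CLASS_FORM), s) + "]");
    }
}
```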

Note that this will not be recursive. You can't have something like *{ foo *{ bar }* }* and have the whole thing treated as a comment. Supporting nesting would make this a context-free language, and trying to parse context-free grammars (CFGs) with regex is among the most famous no-nos.
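To make the limitation concrete, the sketch below runs the newline-tolerant pattern on a nested input and shows the stray closing tag it leaves behind. It also includes a small hand-written scanner that counts nesting depth, which is one possible way to handle nesting if it is ever needed. The scanner is not part of the original answer; the names NestedCommentDemo, stripFlat and stripNested are hypothetical, and it assumes that an unterminated comment should swallow the rest of the input.

```java
import java.util.regex.Pattern;

final class NestedCommentDemo {
    // The lookahead-based pattern in its newline-tolerant form.
    static final Pattern FLAT = Pattern.compile("\\*\\{((?!\\}\\*)[\\s\\S])*\\}\\*");

    // Regex approach: each match ends at the first "}*" after a "*{".
    static String stripFlat(String input) {
        return FLAT.matcher(input).replaceAll("");
    }

    // Hypothetical alternative: scan the input once and keep a depth counter.
    // Characters are copied to the output only while depth is zero.
    // Assumption: an unterminated comment hides everything up to the end.
    static String stripNested(String input) {
        StringBuilder out = new StringBuilder();
        int depth = 0;
        int i = 0;
        while (i < input.length()) {
            if (input.startsWith("*{", i)) {
                depth++;            // entering a (possibly nested) comment
                i += 2;
            } else if (depth > 0 && input.startsWith("}*", i)) {
                depth--;            // leaving the innermost open comment
                i += 2;
            } else {
                if (depth == 0) {
                    out.append(input.charAt(i));
                }
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String nested = "*{ foo *{ bar }* }* tail";

        // The regex stops at the inner "}*" and leaves the outer one behind.
        // prints "[ }* tail]"
        System.out.println("[" + stripFlat(nested) + "]");

        // The depth counter treats the whole nested block as one comment.
        // prints "[ tail]"
        System.out.println("[" + stripNested(nested) + "]");
    }
}
```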

Licensed under: CC-BY-SA with attribution