Question

I've got a regex that seemed fine, but as it turned out it doesn't work well in some situations.

Keep an eye on the message preview, because the message editor does some tricky things with "\".

[\[]?[\^%#\$\*@\-;].*?[\^%#\$\*@\-;][\]]

Its task is to find a pattern which in general looks like this:

[ABA]

  • A - char from set ^,%,#,$,*,@,-,;
  • B - some text
  • [ and ] are included in pattern

It is expected to find all occurrences of this pattern in the test string

Black fox [#sample1#] [%sample2%] - [#sample3#] eats blocks.

but instead of expected list of matches

  • "[#sample1#]"
  • "[%sample2%]"
  • "[#sample3#]"

I get this

  • "[#sample1#]"
  • "[%sample2%]"
  • "- [#sample3#]"

It seems that this problem will also occur with the other chars in set "A". Could somebody suggest changes to my regex to make it work as I need?

A less important thing: how can I make my regex exclude patterns which look like this

[ABC]

  • A - char from set ^,%,#,$,*,@,-,;
  • B - some text
  • C - char from set ^,%,#,$,*,@,-,; other than A
  • [ and ] are included in pattern

for example

[$sample1#] [%sample2@] [%sample3;]

thanks in advance

MTH


Solution

\[([%#$*@;^-]).+?\1\]

applied to text:

Black fox [#sample1#] [%sample2%] - [#sample3#] [%sample4;] eats blocks.

matches

  • [#sample1#]
  • [%sample2%]
  • [#sample3#]
  • but not [%sample4;]

EDIT

This works for me (Output as expected, regex accepted by C# as expected):

Regex re = new Regex(@"\[([%#$*@;^-]).+?\1\]");
string s = "Black fox [#sample1#] [%sample2%] - [#sample3#] [%sample4;] eats blocks.";

MatchCollection mc = re.Matches(s);
foreach (Match m in mc)
{
  Console.WriteLine(m.Value);
}

OTHER TIPS

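Why the first "?" in "[\[]?"? It makes the opening bracket optional, so a match can start at any char from set A, such as the lone "-". With the bracket required,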

\[[\^%#\$\*@\-;].*?[\^%#\$\*@\-;]\]

would detect your different strings just fine
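
To make the effect of the optional bracket concrete, here is a minimal C# sketch that runs the original pattern and the bracket-required variant against the test string from the question. The class name `OptionalBracketDemo` and the helper `FindAll` are hypothetical, introduced only for this illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class OptionalBracketDemo
{
    // Hypothetical helper: returns every match of `pattern` in `input`.
    public static string[] FindAll(string pattern, string input) =>
        Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();

    public static void Main()
    {
        string s = "Black fox [#sample1#] [%sample2%] - [#sample3#] eats blocks.";

        // Pattern from the question: the leading "[" is optional, so the
        // lone "-" (a char from set A) can start a match on its own.
        string original = @"[\[]?[\^%#\$\*@\-;].*?[\^%#\$\*@\-;][\]]";

        // Same pattern with the opening bracket made mandatory.
        string required = @"\[[\^%#\$\*@\-;].*?[\^%#\$\*@\-;]\]";

        // Prints: [#sample1#] | [%sample2%] | - [#sample3#]
        Console.WriteLine(string.Join(" | ", FindAll(original, s)));

        // Prints: [#sample1#] | [%sample2%] | [#sample3#]
        Console.WriteLine(string.Join(" | ", FindAll(required, s)));
    }
}
```

Note that the bracket-required variant still accepts different opening and closing chars, so it does not address the [ABC] part of the question.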

To be more precise:

\[([\^%#\$\*@\-;])([^\]]*?)(?=\1)([\^%#\$\*@\-;])\]

would detect [ABA]

\[([\^%#\$\*@\-;])([^\]]*?)(?!\1)([\^%#\$\*@\-;])\]

would detect [ABC]
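
As a rough check of the two lookahead patterns above, the following C# sketch applies both to one sample string. The input string, the class name `LookaheadDemo`, and the helper `FindAll` are assumptions made for this example, not part of the original answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class LookaheadDemo
{
    // Hypothetical helper: returns every match of `pattern` in `input`.
    public static string[] FindAll(string pattern, string input) =>
        Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();

    public static void Main()
    {
        // Made-up sample mixing [ABA] and [ABC] forms.
        string s = "[#sample1#] [$sample2#] [%sample3@] [%sample4%]";

        // (?=\1): the closing delimiter must equal the captured opening one.
        string aba = @"\[([\^%#\$\*@\-;])([^\]]*?)(?=\1)([\^%#\$\*@\-;])\]";

        // (?!\1): the closing delimiter must differ from the opening one.
        string abc = @"\[([\^%#\$\*@\-;])([^\]]*?)(?!\1)([\^%#\$\*@\-;])\]";

        // Prints: [#sample1#] | [%sample4%]
        Console.WriteLine(string.Join(" | ", FindAll(aba, s)));

        // Prints: [$sample2#] | [%sample3@]
        Console.WriteLine(string.Join(" | ", FindAll(abc, s)));
    }
}
```

Because the middle part is `[^\]]*?`, neither pattern can run past a closing bracket into the next bracketed group.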

Your pattern makes the opening square bracket optional:

[\[]?

For the second part of your question (and to perhaps simplify) try this:

\[\%[^\%]+\%\]|\[\#[^\#]+\#\]|\[\$[^\$]+\$\]

In this case there is a subpattern for each possible delimiter (only three of the delimiters are shown here; the others follow the same form). The | character is "OR", so the pattern will match if any of the 3 subexpressions match.

Each subexpression matches, in order:

  • Opening bracket
  • Special char
  • Everything that is not that special char (1)
  • The same special char
  • Closing bracket

(1) You may need to add extra exclusions like ']' or '[' so it doesn't accidentally match across a large body of text like:

[%MyVar#] blah blah [$OtherVar%]

Rob
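
The cross-match risk mentioned in note (1) can be reproduced with a short C# sketch. The `strict` variant below is one possible way to add the bracket exclusions, written as an assumption for illustration; the class name `AlternationDemo` and the helper `FindAll` are likewise hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class AlternationDemo
{
    // Hypothetical helper: returns every match of `pattern` in `input`.
    public static string[] FindAll(string pattern, string input) =>
        Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();

    public static void Main()
    {
        string s = "[%MyVar#] blah blah [$OtherVar%]";

        // Alternation as posted: the middle part excludes only the delimiter,
        // so it can run across "]" and "[" to a later closing "%]".
        string loose = @"\[\%[^\%]+\%\]|\[\#[^\#]+\#\]|\[\$[^\$]+\$\]";

        // Assumed fix: also exclude "[" and "]" from the middle part.
        string strict = @"\[\%[^\%\[\]]+\%\]|\[\#[^\#\[\]]+\#\]|\[\$[^\$\[\]]+\$\]";

        // The loose pattern matches the entire string as a single match.
        Console.WriteLine(string.Join(" | ", FindAll(loose, s)));

        // The strict pattern finds nothing here; prints 0.
        Console.WriteLine(FindAll(strict, s).Length);
    }
}
```

Both variants still match well-formed input such as `[%a%] [#b#] [$c$]`; only the behaviour on mismatched delimiters differs.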

Licensed under: CC-BY-SA with attribution