Regex, match a string between two characters that contains the same character to delimit

StackOverflow https://stackoverflow.com/questions/21968750

  •  15-10-2022

Question

The title may sound vague, so here is a concrete example of what I mean.

Let's say I have this string:

TestBlock1 {
   NestedBlock1.1
   {
      Text = Text
   }
}
TestBlock2 {
   NestedBlock2.1
   {
      Text = Text
   }
   NestedBlock2.2
   {
      Text = Text
   }
}

I want to match each top-level block of the form BlockName { ... }, including any blocks nested inside it. This is what I have tried:

[\w]+\s*{\s*[^.]+\s*}

The idea is to get the matches into an array.

string[] block = new string[2];
block[0] = "TestBlock1 { NestedBlock1.1 { Text = Text } }";
block[1] = "TestBlock2 { NestedBlock2.1 { Text = Text } NestedBlock2.2 { Text = Text } }";

The problem is that my pattern matches the whole string instead of one block at a time. Is it even possible to match the text between two delimiter characters when that text also contains the same delimiter characters?


Solution

In .NET (which does not support recursive patterns, but does support balancing groups that can count nested braces), you can use

Regex regexObj = new Regex(
    @"\w+\s+        # Match identifier
    \{              # Match {
    (?>             # Then either match (possessively):
     (?:            # the following group which matches
      (?![{}])      # only if we're not before a { or }
      .             # any character
     )+             # once or more
    |               # or
     \{ (?<Depth>)  # { (and increase the braces counter)
    |               # or
     \} (?<-Depth>) # } (and decrease the braces counter).
    )*              # Repeat as needed.
    (?(Depth)(?!))  # Assert that the braces counter is at zero.
    \}              # Then match }.", 
    RegexOptions.IgnorePatternWhitespace | RegexOptions.Singleline);
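
To get the matches into an array, as the question asks, something along these lines should work. Treat it as a minimal sketch rather than a drop-in solution: the names BlockExtractor and ExtractBlocks are made up for illustration, and the character class [^{}]+ stands in for the lookahead form used above. Both consume a run of non-brace characters, newlines included, so RegexOptions.Singleline is not needed in this variant.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class BlockExtractor
{
    // An identifier followed by a brace-balanced body.
    private static readonly Regex BlockRegex = new Regex(
        @"\w+\s+           # identifier, then whitespace
        \{                 # opening brace of the block
        (?>                # atomic group, repeated below:
          [^{}]+           #   a run of non-brace characters
        |
          \{ (?<Depth>)    #   nested opening brace: push onto Depth
        |
          \} (?<-Depth>)   #   nested closing brace: pop from Depth
        )*
        (?(Depth)(?!))     # fail if an opening brace is still unclosed
        \}                 # closing brace of the block",
        RegexOptions.IgnorePatternWhitespace);

    // Returns the text of every top-level block found in the input.
    public static string[] ExtractBlocks(string input)
    {
        return BlockRegex.Matches(input)
                         .Cast<Match>()       // MatchCollection -> IEnumerable<Match>
                         .Select(m => m.Value)
                         .ToArray();
    }
}
```

Calling BlockExtractor.ExtractBlocks(input) on the sample string from the question should return two elements, one for TestBlock1 and one for TestBlock2, each including its nested blocks. Nested blocks are not returned separately because they are consumed as part of the enclosing match.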
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow