Question

I need to parse the following string:

/MyData.csv(MemberCount.sum,Production.sum[Salesperson="James Almond",Area="Europe "Area 1" (Germany)",Area="North America",Area="North America [Level A]"])

The first part is easy; however, the clause within the square brackets ([]) is giving me a bit of a headache, since it can contain double quotes, as the example shows. (Note that the content within the brackets can change dynamically.)

I expect the following output when parsing the last part of the string:

Salesperson  James Almond
Area         Europe "Area 1" (Germany)
Area         North America
Area         North America [Level A]

I've been working with regex but can't seem to get it right. Hoping someone has the magic!


Solution 2

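You could try balancing group definitions, a .NET regex feature for matching nested brackets. Each ']' pops the most recent '[' and stores the text between them in the "Close" group; Groups["Close"].Value returns the last such capture, which here is the content of the outermost pair of brackets:

using System.Text.RegularExpressions;

// input is assumed to hold the string from the question.
string pattern = @"^[^\[\]]*" +                     // text before the first '['
                @"(" +
                @"((?'Open'\[)[^\[\]]*)+" +         // push each '[' onto the Open stack
                @"((?'Close-Open'\])[^\[\]]*)+" +   // pop on ']' and capture the enclosed text
                @")*" +
                @"(?(Open)(?!))$";                  // fail if any '[' is left unclosed

var results =
    Regex.Match(input, pattern)
    .Groups["Close"].Value                          // content of the outermost [...]
    .Split(new char[] { '=', ',' });                // assumes no '=' or ',' inside a value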

This outputs the following (the surrounding double quotes are kept; call Trim('"') on each value to strip them):

Salesperson
"James Almond"
Area
"Europe "Area 1" (Germany)"
Area
"North America"
Area
"North America [Level A]"

OTHER TIPS

You can use the fact that a 'true' closing double quote is always followed by either a comma or the closing square bracket:

Area="(?<Area>.*?)"(?=\]|,)

regex101 demo.
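
As a minimal sketch of how that lookahead idea could be used from C#: the pattern below generalizes the key from the literal Area to any word (\w+), which is an assumption on my part, and the FilterParser class and ParseFilters method are hypothetical names chosen for illustration. The sample string is modelled on the one in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class FilterParser
{
    // Hypothetical helper: returns key/value pairs in the order they appear.
    // A value ends at the first double quote that is followed by ',' or ']'.
    public static List<KeyValuePair<string, string>> ParseFilters(string input)
    {
        // In a verbatim string, "" stands for a single double quote.
        const string pattern = @"(?<key>\w+)=""(?<value>.*?)""(?=,|\])";

        var pairs = new List<KeyValuePair<string, string>>();
        foreach (Match m in Regex.Matches(input, pattern))
        {
            pairs.Add(new KeyValuePair<string, string>(
                m.Groups["key"].Value,
                m.Groups["value"].Value));
        }
        return pairs;
    }

    public static void Main()
    {
        string input =
            "/MyData.csv(MemberCount.sum,Production.sum[" +
            "Salesperson=\"James Almond\"," +
            "Area=\"Europe \"Area 1\" (Germany)\"," +
            "Area=\"North America\"," +
            "Area=\"North America [Level A]\"])";

        foreach (var pair in ParseFilters(input))
        {
            Console.WriteLine($"{pair.Key}: {pair.Value}");
        }
        // Prints:
        // Salesperson: James Almond
        // Area: Europe "Area 1" (Germany)
        // Area: North America
        // Area: North America [Level A]
    }
}
```

Unlike the Split-based approach, this tolerates '=' inside a value, and the captured values come back without their surrounding quotes. It would still cut a value short if the value itself contained a double quote immediately followed by a comma or a closing square bracket.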

Licensed under: CC-BY-SA with attribution