Question

We ran into a problem with this regex:

/\(\((((?>[^\(\(\)\)]+)|(?R))*)\)\)/x

It is used to "spin" text. With a sentence like "((We ((love | like)) this shirt(size xl)))" it does not match correctly, because of the three closing parentheses at the end of the sentence: the first of those three belongs to the text itself.

How it needs to work: when there are three or more parentheses at the beginning, the regex should treat the first two as the opening delimiter, and when there are three or more at the end, it should treat the last two as the closing delimiter. Is that possible?

Keep in mind that it already works quite well on multiple levels, so something like "((this((shirt|sweater))))" is handled correctly (note the four parentheses at the end). It only goes wrong when parentheses that belong to the text start right after the opening spin delimiter or end right before the closing one.

Solution

Well, first of all, you don't need to escape parentheses inside a character class, and listing the same character more than once in a class has no effect; your regex can therefore be written like this without any change in functionality:

\(\((((?>[^()]+)|(?R))*)\)\)
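
The character-class part of that claim can be checked in isolation. The snippet below is only a small sanity check in Python (an assumption: the original pattern looks like PCRE, and Python's built-in re module has no (?R), so only the two character classes are compared, not the full recursive pattern). The variable names are illustrative.

```python
import re

# Character class as written in the question: escaped and duplicated parens.
escaped = re.compile(r"[^\(\(\)\)]+")
# Simplified class: no escapes, each character listed once.
plain = re.compile(r"[^()]+")

sample = "We ((love | like)) this shirt(size xl)"

# Both classes split the sample into the same runs of non-parenthesis text.
runs_escaped = escaped.findall(sample)
runs_plain = plain.findall(sample)
print(runs_plain)  # ['We ', 'love | like', ' this shirt', 'size xl']
```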

I'm not entirely sure why you're using the atomic group either; I could be wrong (and do correct me if I am), but I don't see any difference in what it matches compared with a standard non-capturing group.

That said, you can allow nested single-parenthesis strings by adding another alternative to the regex's inner group:

\(\(((?:[^()]|((?R))|(\((?:[^()]|(?3))*\)))*)\)\)
    1         2      3

Capture groups:
1 - gets the whole content between the outermost (( ... ))
2 - gets the inner matches, i.e. any nested (( ... ))
3 - gets a single-parenthesis string ( ... ); it is a capture group so that the pattern can recurse into it with (?3). If you don't need it, just ignore it in the result array; it's there only to allow nested single parens.
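
For readers who want to see the same three alternatives outside a regex engine, here is a rough sketch in Python. It is not the regex above: Python's built-in re module does not support (?R) or (?3), so this swaps in a small hand-written recursive scanner that follows the same idea (plain characters, a nested (( ... )), or a balanced single ( ... )). The function names match_single, match_spin and find_spins are made up for illustration, and the scanner uses simple ordered choice, so it may not agree with the regex on every malformed input.

```python
def match_single(s, i):
    """Match a balanced '( ... )' starting at index i; return the end index or None."""
    if i >= len(s) or s[i] != "(":
        return None
    i += 1
    while i < len(s):
        if s[i] == ")":
            return i + 1
        if s[i] == "(":
            end = match_single(s, i)  # nested single parens (the role of group 3)
            if end is None:
                return None
            i = end
        else:
            i += 1
    return None


def match_spin(s, i):
    """Match a '(( ... ))' spin group starting at index i; return the end index or None."""
    if not s.startswith("((", i):
        return None
    j = i + 2
    while j < len(s):
        if s[j] == "(":
            # Try a nested spin group first, then fall back to a single-paren string.
            end = match_spin(s, j)
            if end is None:
                end = match_single(s, j)
            if end is None:
                return None
            j = end
        elif s[j] == ")":
            break
        else:
            j += 1
    return j + 2 if s.startswith("))", j) else None


def find_spins(s):
    """Return the outermost spin groups, delimiters included, scanning left to right."""
    found = []
    i = 0
    while i < len(s):
        end = match_spin(s, i)
        if end is None:
            i += 1
        else:
            found.append(s[i:end])
            i = end
    return found


print(find_spins("((We ((love | like)) this shirt(size xl)))"))
# ['((We ((love | like)) this shirt(size xl)))']
```

Stripping the first and last two characters of each result gives the equivalent of capture group 1; calling find_spins again on that inner text yields the nested groups.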

regex101 demo

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow