Question

I have the following regexp:

/xxx ([a-z]+)(?:, ([a-z]+))* xxx/

I want to capture all colors in the following test string:

xxx red, blue, pink, purple xxx

(Currently only red and purple are captured.)

Open this URL to see the matched groups: http://www.regex101.com/r/oZ2cH4

I have read http://www.regular-expressions.info/captureall.html, but the trick didn't work

(or maybe I applied it incorrectly).

How can I resolve this?

Thank you in advance.

Solution

You probably want to match in two steps: first capture the whole list with an outer pattern, then extract each word from that capture with an inner pattern:

$word = '[a-z]+';
$sep  = '[, ]+';

$words = $captures("~($word)(?:{$sep})?~");
$of    = $captures("~xxx ({$word}(?:{$sep}{$word})*) xxx~");

print_r($words($of($subject)));

Output:

Array
(
    [0] => red
    [1] => blue
    [2] => pink
    [3] => purple
)

Here $captures is a function that returns a pre-configured preg_match_all call. The returned function accepts not only a string as its subject but anything foreach can operate on:

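$captures = function ($pattern, $group = 1) {
    // Returns a closure that collects capture group $group from every match.
    return function ($subject) use ($pattern, $group) {
        // Wrap a single string so strings and iterables are handled alike.
        if (is_string($subject)) {
            $subject = (array)$subject;
        }
        $captures = [];
        foreach ($subject as $step) {
            preg_match_all($pattern, $step, $matches);
            // $matches[$group] holds that group's text for all matches in $step.
            $captures = array_merge($captures, $matches[$group]);
        }
        return $captures;
    };
};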

By default and as used in the example above, it returns the first group (1), but this can be configured.
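
As a rough sketch of configuring the group: the key=value input and the pattern below are made up for illustration and are not part of the question. Passing 2 as the second argument collects the second group instead of the first.

```php
<?php
// Same helper as above, condensed and repeated so this snippet runs on its own.
$captures = function ($pattern, $group = 1) {
    return function ($subject) use ($pattern, $group) {
        $captures = [];
        foreach ((array)$subject as $step) {
            preg_match_all($pattern, $step, $matches);
            $captures = array_merge($captures, $matches[$group]);
        }
        return $captures;
    };
};

// Hypothetical input: "name=number" pairs (an assumption, not from the question).
$pairs = 'red=1, blue=2, pink=3';

$keys   = $captures('~([a-z]+)=([0-9]+)~');    // group 1 (the default)
$values = $captures('~([a-z]+)=([0-9]+)~', 2); // group 2

print_r($keys($pairs));   // red, blue, pink
print_r($values($pairs)); // 1, 2, 3 (as strings)
```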

This makes it possible to first match the outer pattern ($of) and then apply the inner pattern ($words) to each of those matches. The example in full:

$subject = 'xxx red, blue, pink, purple xxx';

$captures = function ($pattern, $group = 1) {
    return function ($subject) use ($pattern, $group) {
        if (is_string($subject)) {
            $subject = (array)$subject;
        }
        $captures = [];
        foreach ($subject as $step) {
            preg_match_all($pattern, $step, $matches);
            $captures = array_merge($captures, $matches[$group]);
        }
        return $captures;
    };
};

$word = '[a-z]+';
$sep  = '[, ]+';

$words = $captures("~($word)(?:{$sep})?~");
$of    = $captures("~xxx ({$word}(?:{$sep}{$word})*) xxx~");

print_r($words($of($subject)));

See the live-demo.

OTHER TIPS

The tutorial "Repeating a Capturing Group vs. Capturing a Repeated Group" (by regular-expressions.info) describes how you would capture all of the content "red, blue, pink, purple" in a single capture. The pattern it would suggest is

/xxx ((?:[a-z]+(?:, )?)+) xxx/

but if this were really what you were trying to accomplish, you may as well use the simpler expression

/xxx ([a-z, ]*) xxx/
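
To make the difference concrete, here is a small sketch run against the test string from the question. The variable names are only illustrative. It shows that the original pattern keeps just the last repetition of its second group, while both patterns above put the whole list into group 1.

```php
<?php
$subject = 'xxx red, blue, pink, purple xxx';

// Original pattern: the repeated group is overwritten on each iteration,
// so only its last value survives.
preg_match('/xxx ([a-z]+)(?:, ([a-z]+))* xxx/', $subject, $repeated);

// Tutorial-style pattern: one group wrapped around the whole repetition.
preg_match('/xxx ((?:[a-z]+(?:, )?)+) xxx/', $subject, $strict);

// Simpler character-class pattern.
preg_match('/xxx ([a-z, ]*) xxx/', $subject, $loose);

echo $repeated[1], ' / ', $repeated[2], "\n"; // red / purple
echo $strict[1], "\n";                        // red, blue, pink, purple
echo $loose[1], "\n";                         // red, blue, pink, purple
```

One trade-off to keep in mind: the character-class version also accepts lists that are not well formed, such as doubled commas or extra spaces.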

I suspect what you actually want is to capture each color individually. This might be best accomplished by capturing the entire list once, then parsing that captured content.
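
A minimal sketch of that two-step approach in PHP, assuming the list is always separated by a comma followed by one space as in the test string (the variable names are illustrative):

```php
<?php
$subject = 'xxx red, blue, pink, purple xxx';

$colors = [];
// Step 1: capture the entire comma-separated list in a single group.
if (preg_match('/xxx ([a-z]+(?:, [a-z]+)*) xxx/', $subject, $m)) {
    // Step 2: split the captured list on the ", " separator.
    $colors = explode(', ', $m[1]);
}

print_r($colors); // red, blue, pink, purple as four array elements
```

If the separators can vary (for example optional spaces), preg_split with a pattern such as '/,\s*/' could replace explode in step 2.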

Licensed under: CC-BY-SA with attribution