Question

I am having some trouble cooking up a regex that produces this result:

Mike1, misha1,2, miguel1,2,3,4,5,6,7,18, and Micheal2,3

How does one step back in a regex and discard the last match? That is, I need a comma that comes before a space not to match. This is what I came up with:

\d+(,|\r)

Mike1, misha1,2, miguel1,2,3,4,5,6,7,18, and Micheal2,3


Solution

The regex feature you're asking about is called a lookaround (for a comma that must not be followed by a space, that would be a negative lookahead). But in your case, I don't think you need it. Try this:

\d+(?:,\d+)*

In your example, this will match the comma-delimited lists of numbers and exclude the names, the trailing commas, and the whitespace.

Here is a short bit of test code written in PHP that verifies it on your input:

<?php
$input = "Mike1, misha1,2, miguel1,2,3,4,5,6,7,18, and Micheal2,3";
$matches = array();
preg_match_all('/\d+(?:,\d+)*/', $input, $matches);
print_r($matches[0]);
?>

outputs:

Array
(
    [0] => 1
    [1] => 1,2
    [2] => 1,2,3,4,5,6,7,18
    [3] => 2,3
)
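
To see why the pattern from the question picks up the trailing commas while the suggested one does not, here is a minimal Python sketch (Python rather than PHP purely for illustration; the variable names are my own). It runs both patterns over the same input and spells out the suggested pattern with re.VERBOSE. The capture group in the question's pattern is made non-capturing here so that findall returns the whole match.

```python
import re

text = "Mike1, misha1,2, miguel1,2,3,4,5,6,7,18, and Micheal2,3"

# Pattern from the question: digits followed by a comma or carriage return.
# The comma is part of the match, so every "N," comes back as its own match.
original = re.compile(r"\d+(?:,|\r)")

# The suggested pattern, written out with comments.
suggested = re.compile(
    r"""
    \d+         # first number in the list
    (?:         # non-capturing group, repeated zero or more times:
        ,\d+    #   a comma immediately followed by another number
    )*
    """,
    re.VERBOSE,
)

original_matches = original.findall(text)
suggested_matches = suggested.findall(text)

print(original_matches)
print(suggested_matches)
```

Because a comma is only accepted when a digit follows it directly, a comma before a space can never become part of a match, so nothing has to be "stepped back" over.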

OTHER TIPS

I believe \d+,(?!\s) will do what you want. The (?!...) construct is a negative lookahead: it succeeds only if the pattern inside it does not match at the current position in the search string, and it consumes no characters.

>>> re.findall(r'\d+,(?!\s)', 'Mike1, misha1,2, miguel1,2,3,4,5,6,7,18, and Michea2,3')
['1,', '1,', '2,', '3,', '4,', '5,', '6,', '7,', '2,']
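
The "consumes no characters" part is what makes the lookahead useful here. As a rough illustration (the shortened input string and the variable names are my own), compare it with a variant that actually consumes the character after the comma:

```python
import re

text = "misha1,2, miguel1,2,3,4"

# Lookahead: checks the next character without consuming it,
# so the following number is still available to start the next match.
with_lookahead = re.findall(r"\d+,(?!\s)", text)

# Consuming variant: \S takes the character after the comma into the match,
# so that digit can no longer start a match of its own.
consuming = re.findall(r"\d+,\S", text)

print(with_lookahead)  # ['1,', '1,', '2,', '3,']
print(consuming)       # ['1,2', '1,2', '3,4']
```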

Or, if you want to match each comma-separated list of numbers without its final comma, use \d+(?:,\d+)*.

>>> re.findall(r'\d+(?:,\d+)*', 'Mike1, misha1,2, miguel1,2,3,4,5,6,7,18, and Michea2,3')
['1', '1,2', '1,2,3,4,5,6,7,18', '2,3']
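
If the goal is to work with the numbers themselves, each match can be split on the comma afterwards. This is only a sketch of one possible follow-up step, and number_lists is a name I made up:

```python
import re

text = "Mike1, misha1,2, miguel1,2,3,4,5,6,7,18, and Michea2,3"

# Each match is one comma-separated list; split it and convert each item to int.
number_lists = [
    [int(n) for n in match.split(",")]
    for match in re.findall(r"\d+(?:,\d+)*", text)
]

print(number_lists)  # [[1], [1, 2], [1, 2, 3, 4, 5, 6, 7, 18], [2, 3]]
```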
Licensed under: CC-BY-SA with attribution