Question

import re

string = "input-ports 6012, 6017, 6016"
m = re.match(r"input-ports(\s\d{4},?)(\s\d{4},?)(\s\d{4},?)", string)
print(m.groups())  #=> (' 6012,', ' 6017,', ' 6016')

But when I use group repetition, it only returns the last number:

m = re.match(r"input-ports(\s\d{4},?)+", string)
print(m.groups())  #=> (' 6016',)

Can anyone tell me why?

Solution

Traditional regex engines, including Python's built-in re module, keep only the last match of a repeated capturing group. Some more advanced libraries expose a captures property that holds every match for a given group. The third-party regex library for Python does this, among other nice things:

import regex

string = "input-ports 6012, 6017, 6016"
# The inner group (\d{4}) is repeated; captures(1) returns every iteration.
m = regex.match(r"input-ports(?:\s(\d{4}),?)+", string)
print(m.captures(1))  # ['6012', '6017', '6016']

If you can't use this library, one workaround is to use findall and replace the repetition with a single pattern wrapped in lookarounds (a lookbehind and a lookahead). This is not always possible, but your example is easy:

import re

string = "input-ports 6012, 6017, 6016"
# Four digits preceded by whitespace and followed by a comma or end of string.
m = re.findall(r"(?<=\s)\d{4}(?=,|$)", string)
print(m)  # ['6012', '6017', '6016']
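
One caveat with the findall approach: the lookarounds check only the digits and their immediate neighbours, not the input-ports prefix, so any line with the same shape would match. A minimal sketch of guarding against that follows; the helper name input_ports and the "output-ports" line are hypothetical and used only for illustration.

```python
import re

# Same lookaround pattern as in the answer: four digits preceded by
# whitespace and followed by a comma or the end of the string.
PORT = re.compile(r"(?<=\s)\d{4}(?=,|$)")

def input_ports(line):
    """Return the port numbers only if the line starts with 'input-ports'.

    Hypothetical helper, not part of the original answer.
    """
    # findall alone does not check the prefix, so guard it explicitly.
    if not line.startswith("input-ports"):
        return []
    return PORT.findall(line)

print(input_ports("input-ports 6012, 6017, 6016"))  # ['6012', '6017', '6016']
print(PORT.findall("output-ports 7001, 7002"))      # ['7001', '7002'] (prefix ignored)
print(input_ports("output-ports 7001, 7002"))       # []
```

Whether a plain prefix check is enough depends on how strict the input format needs to be.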

Other tips

Note: A repeated capturing group will only capture the last iteration. Put a capturing group around the repeated group to capture all iterations, or use a non-capturing group instead if you're not interested in the data.

(from regex101)
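
With the standard re module, that advice might look like the sketch below: an outer group captures the whole repeated run in one piece, and the individual numbers are then pulled out in a second step. The second step (findall on the captured run) is an assumption about how one would finish the job, not something the regex101 note prescribes.

```python
import re

string = "input-ports 6012, 6017, 6016"

# The outer group captures the entire repeated run. The inner group is
# non-capturing, so nothing is overwritten on each iteration.
m = re.match(r"input-ports((?:\s\d{4},?)+)", string)
run = m.group(1)
print(repr(run))  # ' 6012, 6017, 6016'

# Second pass: extract each four-digit number from the captured run.
ports = re.findall(r"\d{4}", run)
print(ports)  # ['6012', '6017', '6016']
```

Unlike a bare findall over the whole string, this only yields numbers when the line actually begins with input-ports.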

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow