Question

I bumped into this problem while playing around in Python: when I take an arbitrary string, let's say "test 1981", the following Python call returns an empty string.

>>> import re
>>> re.search(r'\d?', "test 1981").group()
''

I was wondering why this is. I was reading through some other posts, and it seems that it has to do with greedy vs. non-greedy operators. Is it that the '?' checks to see if the first value is a digit, and if it's not, it takes the easier, quicker path and just outputs nothing? Any clarification would help. Thanks!


Solution

Your pattern matches either a single digit or the empty string. The search starts at the first character and tries to match a digit there. When that fails, it tries the alternative, which is the empty string, and voilà: a match is found before the first character.

I think you expected it to move on and try to match at the next character, but that does not happen. The engine first tries everything the quantifier allows at the first position, and that is zero or one digit. Because zero digits always matches, the search never needs to advance.
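To make the "match before the first character" point concrete, here is a minimal sketch that compares where the two patterns match. The variable names `optional` and `required` are only illustrative; `span()` returns the start and end index of the match.

```python
import re

text = "test 1981"

# \d? may match zero digits, so the search succeeds immediately at index 0.
optional = re.search(r"\d?", text)
print(optional.span())         # (0, 0): an empty match before the "t"
print(repr(optional.group()))  # ''

# \d must consume a digit, so the search keeps scanning until it finds one.
required = re.search(r"\d", text)
print(required.span())         # (5, 6)
print(required.group())        # 1
```

The empty span (0, 0) shows that the search did succeed, just with nothing consumed.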

The optional quantifier only makes sense in combination with a required part. Say you want a digit followed by an optional second one:

>>> re.search(r'\d\d?', "test 1981").group()
'19'

Otherwise your pattern can match the empty string at any position, so the search always succeeds, even on a string with no digits at all.
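As a rough illustration of why a pattern that can match nothing is useless as a test, the sketch below runs both patterns over a few sample strings. The inputs in `samples` are made up for this example.

```python
import re

samples = ["test 1981", "no digits here", ""]  # hypothetical inputs

# \d? can match the empty string, so it "finds" something in every input.
always = [re.search(r"\d?", s) is not None for s in samples]

# \d needs a real digit, so it only succeeds where one exists.
has_digit = [re.search(r"\d", s) is not None for s in samples]

print(always)     # [True, True, True]
print(has_digit)  # [True, False, False]
```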

OTHER TIPS

Regex

\d?

simply means that it should optionally (?) match a single digit (\d).

If you use something like this, it will work as you expect (match the first single digit, wherever it is in the string):

\d

re.search(r'\d?', "test 1981").group() returns the first (leftmost) match of the pattern (0 or 1 digits) it can find. In this case that is zero digits, at the very start of the string. Note that re.search(r'\d?', "1981 test").group() actually returns '1', because the quantifier is greedy and takes the digit when one is available at that position. What you're probably looking for here is re.search(r'\d+', "test 1981").group(), which returns the whole run of digits '1981' wherever it sits in the string.
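The three patterns discussed in these answers can be compared side by side. This is only a sketch: `first_match` is a hypothetical helper written for this comparison, not part of the `re` module.

```python
import re

def first_match(pattern, text):
    """Return the text of the first match, or None if nothing matches."""
    m = re.search(pattern, text)
    return m.group() if m else None

for text in ("test 1981", "1981 test"):
    for pattern in (r"\d?", r"\d", r"\d+"):
        # repr() makes the empty-string result visible as ''
        print(f"{pattern:4} on {text!r}: {first_match(pattern, text)!r}")
```

Only `\d?` depends on where the digits are: it returns '' for "test 1981" and '1' for "1981 test", while `\d` returns '1' and `\d+` returns '1981' for both strings.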

Licensed under: CC-BY-SA with attribution