Why does Python regex with literal words not match, but \w+ does? [closed]

https://stackoverflow.com/questions/22996085

01-07-2023

Question

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.

This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.

Closed 9 years ago.

I wrote a regular expression that looks like it should work, but it only matches after I replace some of the literal strings with \w+ word patterns, and I want to understand why.

Here's the example:

import re

text = "   1    p2       2.26347691E+12    optvl    9.05369210E+04    ctha     6.00000000E+01"

p1 = re.compile(r"\s+(\d+)\s+p2\s+([\d\.\+\-E]+)\s+optv1\s+([\d\.\+\-E]+)\s+ctha\s+([\d\.\+\-E]+)")
m1 = p1.findall(text)
print(m1)

p2 = re.compile(r"\s+(\d+)\s+p2\s+([\d\.\+\-E]+)\s+\w+\s+([\d\.\+\-E]+)\s+\w+\s+([\d\.\+\-E]+)")
m2 = p2.findall(text)
print(m2)

Here's the output:

[]
[('1', '2.26347691E+12', '9.05369210E+04', '6.00000000E+01')]

Thanks for any insight!

Edit: yes, it was a typo: the classic mix-up of the letter l and the digit 1.

Solution

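The first pattern (the one with literal words) contains a typo: it spells the literal as optv1, ending in the digit 1, while the text contains optvl, ending in the letter l. The second pattern matches because \w+ accepts either spelling. With l instead of 1, the first pattern matches too: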

p1 = re.compile(r"\s+(\d+)\s+p2\s+([\d\.\+\-E]+)\s+optvl\s+([\d\.\+\-E]+)\s+ctha\s+([\d\.\+\-E]+)")
                                                       ^
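
A minimal sketch of how this kind of typo can be tracked down, assuming the same text as in the question. The names num, literals, and missing are hypothetical helpers introduced only for this illustration. It checks each literal word from the pattern against the text with a plain substring test, then compares the mistyped and corrected patterns:

```python
import re

text = "   1    p2       2.26347691E+12    optvl    9.05369210E+04    ctha     6.00000000E+01"

# Hypothetical helper: the numeric group shared by both patterns.
num = r"([\d.+\-E]+)"

# Literal words used in the original (mistyped) pattern.
literals = ["p2", "optv1", "ctha"]

# A plain substring test shows which literal never occurs in the text.
missing = [word for word in literals if word not in text]
print(missing)

# Same pattern, once with the digit 1 and once with the letter l.
typo = re.compile(r"\s+(\d+)\s+p2\s+" + num + r"\s+optv1\s+" + num + r"\s+ctha\s+" + num)
fixed = re.compile(r"\s+(\d+)\s+p2\s+" + num + r"\s+optvl\s+" + num + r"\s+ctha\s+" + num)

typo_matches = typo.findall(text)
fixed_matches = fixed.findall(text)
print(typo_matches)
print(fixed_matches)
```

When a regex with literal words fails but a looser version succeeds, testing each literal on its own like this usually narrows the problem to a single token.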

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow