Question

I wrote a regular expression that looks like it should match, but it only works once I replace some of the literal strings with \w+ patterns, and I want to understand why.

Here's the example:

import re

text = "   1    p2       2.26347691E+12    optvl    9.05369210E+04    ctha     6.00000000E+01"

p1 = re.compile(r"\s+(\d+)\s+p2\s+([\d\.\+\-E]+)\s+optv1\s+([\d\.\+\-E]+)\s+ctha\s+([\d\.\+\-E]+)")
m1 = p1.findall(text)
print(m1)

p2 = re.compile(r"\s+(\d+)\s+p2\s+([\d\.\+\-E]+)\s+\w+\s+([\d\.\+\-E]+)\s+\w+\s+([\d\.\+\-E]+)")
m2 = p2.findall(text)
print(m2)

Here's the output:

[]
[('1', '2.26347691E+12', '9.05369210E+04', '6.00000000E+01')]

Thanks for any insight!

Edit: yep, it's a typo - the old l vs 1


Solution

There is a typo in the first pattern (the one with literal words): it contains optv1 (digit 1), but the text contains optvl (letter l). It should be:

p1 = re.compile(r"\s+(\d+)\s+p2\s+([\d\.\+\-E]+)\s+optvl\s+([\d\.\+\-E]+)\s+ctha\s+([\d\.\+\-E]+)")
                                                       ^
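
If a long pattern silently returns no matches, one way to locate the culprit is to build the pattern up piece by piece and see where it stops matching. The following is only a rough sketch: the helper name first_failing_piece and the way the pattern is split into pieces are my own choices for illustration, not anything from the re module. It deliberately keeps the optv1 typo from the question.

```python
import re

text = "   1    p2       2.26347691E+12    optvl    9.05369210E+04    ctha     6.00000000E+01"

# Number sub-pattern copied from the question.
NUM = r"([\d\.\+\-E]+)"

# The original pattern split into pieces; "optv1" keeps the typo on purpose.
pieces = [
    r"\s+(\d+)",
    r"\s+p2",
    r"\s+" + NUM,
    r"\s+optv1",
    r"\s+" + NUM,
    r"\s+ctha",
    r"\s+" + NUM,
]

def first_failing_piece(pieces, text):
    """Hypothetical helper: return the index of the first piece whose
    addition makes the growing pattern stop matching, or None if the
    whole pattern matches."""
    for i in range(1, len(pieces) + 1):
        if re.search("".join(pieces[:i]), text) is None:
            return i - 1
    return None

bad = first_failing_piece(pieces, text)
print(pieces[bad])  # prints \s+optv1
```

Once the failing piece is known, comparing it character by character against the text makes the l vs 1 mix-up much easier to spot.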
Licensed under: CC-BY-SA with attribution