How to use a literal to separate one or more words?

https://stackoverflow.com/questions/20313689

06-08-2022

Question

I need to make a rule to evaluate the following expressions.

Sao Paulo to Rio de Janeiro >> ["Sao Paulo", "Rio de Janeiro"]

Rio de Janeiro to Brasilia >> ["Rio de Janeiro", "Brasilia"]

Brasilia to Sao Jose dos Pinhais >> ["Brasilia", "Sao Jose dos Pinhais"]

I tried the following, without success:

from pyparsing import *

source = OneOrMore(Word(alphas))
target = OneOrMore(Word(alphas))
expression = source + Literal('to') + target

# input string
phase = "Sao Paulo to Rio de Janeiro"

# parse input string
print(phase, "->", expression.parseString(phase))

Solution

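The problem is that OneOrMore(Word(alphas)) is greedy. Because 'to' itself matches Word(alphas), source consumes the entire input and Literal('to') never gets a chance to match. You need a lookahead that stops the repetition before 'to'.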
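To see the greediness directly, here is a minimal sketch that reuses the grammar from the question. The variable names (text, words, consumed, failed) are only illustrative and are not part of the original code.

```python
from pyparsing import Literal, OneOrMore, ParseException, Word, alphas

text = "Sao Paulo to Rio de Janeiro"
words = OneOrMore(Word(alphas))

# On its own, the repetition swallows every word, including 'to'.
consumed = words.parseString(text).asList()
print(consumed)

# So in the full expression nothing is left for Literal('to') to match.
expression = words + Literal("to") + words
try:
    expression.parseString(text)
    failed = False
except ParseException as err:
    failed = True
    print("parse failed:", err)
```

pyparsing does not backtrack out of a repetition, so once the first OneOrMore has reached the end of the string the whole expression fails.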

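You can make it non-greedy with the NOT operator (~), which acts as a negative lookahead: Word(alphas) + ~FollowedBy(to) matches a word only when the next token is not 'to', so the repetition stops early. The catch is that the word immediately before 'to' fails that lookahead and is left out of the repetition. So you have to tack on one more Word(alphas) at the end to pick it up.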

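from pyparsing import FollowedBy, Group, Literal, OneOrMore, Suppress, Word, ZeroOrMore, alphas

to = Suppress(Literal('to'))
# words not followed by 'to', then the one word that is followed by 'to'
src = Group(ZeroOrMore(Word(alphas) + ~FollowedBy(to)) + Word(alphas)).setResultsName('src')
# Group keeps the destination words together as one list
dst = Group(OneOrMore(Word(alphas))).setResultsName('dst')
exp = src + to + dst
phase = "Sao Paulo to Rio de Janeiro"

# parse input string
s = exp.parseString( phase )
print(s.src)
print(s.dst)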

When you run it, the source and destination words come out as two separate lists:

['Sao', 'Paulo']
['Rio', 'de', 'Janeiro']
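The question asked for each place as a single string, such as "Sao Paulo", rather than a list of words. One possible variation is sketched below. It differs from the answer's code in three ways: it puts the negative lookahead in front of every word (~to + Word(alphas)) instead of after it, it swaps Literal for Keyword so that a word that merely starts with "to" is not mistaken for the separator, and it joins the words with a parse action. The names place, route and parse_route are hypothetical.

```python
from pyparsing import Keyword, OneOrMore, Suppress, Word, alphas

# Keyword matches 'to' only as a whole word; Suppress drops it from the results.
to = Suppress(Keyword("to"))

# A place name is one or more words, none of which is the separator 'to'.
place = OneOrMore(~to + Word(alphas))
# Join the matched words back into one string per place.
place.setParseAction(lambda tokens: " ".join(tokens))

route = place + to + place


def parse_route(text):
    """Return [source, destination] for a string like 'A B to C D'."""
    return route.parseString(text).asList()


for phrase in (
    "Sao Paulo to Rio de Janeiro",
    "Rio de Janeiro to Brasilia",
    "Brasilia to Sao Jose dos Pinhais",
):
    print(phrase, ">>", parse_route(phrase))
```

Checking the lookahead before each word means no extra trailing Word(alphas) is needed, because the repetition simply stops when the next token is the separator.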

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow