Question

I have the same problem as described here (nested function calls).

I also want to limit the functor to one of a given set of words (a, b, c),

so these are legal:

a(dd, ee)
b(a(1)) 

but this is not:

aa(b(9))  - aa is not a valid functor here

I can achieve that using one of:

functor1 = Literal('a') | Literal('b') | Literal('c')
functor2 = Word('a') | Word('b') | Word('c')
functor3 = Keyword('a') | Keyword('b') | Keyword('c')
functor4 = Combine(Keyword('a') | Keyword('b') | Keyword('c'))

The first is easy to understand, but the rest are too ambiguous for me (especially since Word has an asKeyword parameter, yet its code does not use the Keyword class, and vice versa).

Please compare them.

Does an OR-list work like Combine?


Solution

Here is some test code for comparing your pyparsing expressions.

from pyparsing import Combine, Keyword, Literal, ParseException, Word

functor1 = Literal('a') | Literal('b') | Literal('c')
functor2 = Word('a') | Word('b') | Word('c')
functor3 = Keyword('a') | Keyword('b') | Keyword('c')
functor4 = Combine(Keyword('a') | Keyword('b') | Keyword('c'))

# Name each expression so that printing it shows how it was built.
functor1.setName("Literal('a') | Literal('b') | Literal('c')")
functor2.setName("Word('a') | Word('b') | Word('c')")
functor3.setName("Keyword('a') | Keyword('b') | Keyword('c')")
functor4.setName("Combine(Keyword('a') | Keyword('b') | Keyword('c'))")
functors = [functor1, functor2, functor3, functor4]

tests = "a b c aaa bbb ccc after before".split()
for func in functors:
    print(func)
    for t in tests:
        try:
            print(t, ':', func.parseString(t))
        except ParseException as pe:
            # Report the failure and continue with the next test string.
            print(pe)
    print()

prints:

Literal('a') | Literal('b') | Literal('c')
a : ['a']
b : ['b']
c : ['c']
aaa : ['a']
bbb : ['b']
ccc : ['c']
after : ['a']
before : ['b']

Word('a') | Word('b') | Word('c')
a : ['a']
b : ['b']
c : ['c']
aaa : ['aaa']
bbb : ['bbb']
ccc : ['ccc']
after : ['a']
before : ['b']

Keyword('a') | Keyword('b') | Keyword('c')
a : ['a']
b : ['b']
c : ['c']
aaa : Expected "a" (at char 0), (line:1, col:1)
bbb : Expected "a" (at char 0), (line:1, col:1)
ccc : Expected "a" (at char 0), (line:1, col:1)
after : Expected "a" (at char 0), (line:1, col:1)
before : Expected "a" (at char 0), (line:1, col:1)

Combine(Keyword('a') | Keyword('b') | Keyword('c'))
a : ['a']
b : ['b']
c : ['c']
aaa : Expected "a" (at char 0), (line:1, col:1)
bbb : Expected "a" (at char 0), (line:1, col:1)
ccc : Expected "a" (at char 0), (line:1, col:1)
after : Expected "a" (at char 0), (line:1, col:1)
before : Expected "a" (at char 0), (line:1, col:1)

You should be able to make these observations:

  • Literal will match the given string even if it is just the start of a larger word, which is why 'after' yields 'a'.

  • Word will match a run of one or more characters drawn from the set in its constructor string, so Word('a') matches 'a', 'aa', 'aaa', and so on.

  • Keyword will match the given string only if it is not part of a larger word (that is, it is followed by whitespace, by a non-word character, or by the end of the input).

  • Combine does not really do anything in this example, because each alternative already yields a single token.
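
Applied to the original problem, these observations suggest Keyword for the functor. The following is a minimal sketch, not a definitive grammar: the names functor, arg, arg_list, call, and parse_call are illustrative, and the argument syntax (comma-separated alphanumeric words or nested calls) is an assumption based on the examples in the question.

```python
from pyparsing import (Forward, Group, Keyword, MatchFirst, Optional,
                       ParseException, Suppress, Word, ZeroOrMore, alphanums)

# Only these exact names are accepted as functors (assumed from the question).
functor = MatchFirst([Keyword(name) for name in ("a", "b", "c")])

call = Forward()
# An argument is either a nested call or a plain alphanumeric word.
arg = call | Word(alphanums)
arg_list = arg + ZeroOrMore(Suppress(",") + arg)
# Group each call so that nesting shows up as nested lists in the result.
call <<= Group(functor + Suppress("(") + Optional(arg_list) + Suppress(")"))


def parse_call(text):
    """Return the nested-list parse of text, or None if it is not a legal call."""
    try:
        return call.parseString(text, parseAll=True).asList()
    except ParseException:
        return None


print(parse_call("a(dd, ee)"))  # [['a', 'dd', 'ee']]
print(parse_call("b(a(1))"))    # [['b', ['a', '1']]]
print(parse_call("aa(b(9))"))   # None, because aa is not a valid functor
```

With Literal in place of Keyword, the functor would match only the first 'a' of 'aa', and the parse would then fail on the second 'a' rather than at the functor itself. Keyword rejects 'aa' up front.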

The purpose of Combine is to merge multiple matched tokens into a single string. For instance, if you defined a Social Security number as:

Word(nums,exact=3) + '-' + Word(nums,exact=2) + '-' + Word(nums,exact=4)

then parsing "555-66-7777" would give you

['555', '-', '66', '-', '7777']

Most likely you would like this as a single string, so combine the results by wrapping your parser expression in Combine:

Combine(Word(nums,exact=3) + '-' + Word(nums,exact=2) + '-' + Word(nums,exact=4))

['555-66-7777']
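
The comparison above can be run as the sketch below. The helper name ssn_fields is hypothetical. The sketch also shows one further property of Combine: by default (adjacent=True) it requires the matched tokens to be adjacent in the input, so whitespace between the parts makes the combined expression fail. As for the question of whether an OR-list works like Combine: it does not. An alternation simply returns the token from whichever alternative matched, so there is nothing to merge.

```python
from pyparsing import Combine, ParseException, Word, nums


def ssn_fields():
    # Build a fresh expression each time so the two parsers share no state.
    return Word(nums, exact=3) + "-" + Word(nums, exact=2) + "-" + Word(nums, exact=4)


separate = ssn_fields()
combined = Combine(ssn_fields())

print(separate.parseString("555-66-7777").asList())  # ['555', '-', '66', '-', '7777']
print(combined.parseString("555-66-7777").asList())  # ['555-66-7777']

# Without Combine, whitespace between tokens is skipped as usual.
spaced_separate = separate.parseString("555 - 66 - 7777").asList()
print(spaced_separate)

# With Combine and its default adjacent=True, the same input is rejected.
try:
    combined.parseString("555 - 66 - 7777")
    spaced_combined_ok = True
except ParseException:
    spaced_combined_ok = False
print(spaced_combined_ok)
```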
Licensed under: CC-BY-SA with attribution