pyparsing - defining keywords - compare Literal, Word, Keyword and Combine

Question

Here is some test code for comparing your pyparsing expressions.

from pyparsing import Combine, Keyword, Literal, ParseException, Word

# Four ways of matching the strings 'a', 'b' or 'c'
functor1 = Literal('a') | Literal('b') | Literal('c')
functor2 = Word('a') | Word('b') | Word('c')
functor3 = Keyword('a') | Keyword('b') | Keyword('c')
functor4 = Combine(Keyword('a') | Keyword('b') | Keyword('c'))

# Name each expression after its definition so that it prints readably
functor1.setName("Literal('a') | Literal('b') | Literal('c')")
functor2.setName("Word('a') | Word('b') | Word('c')")
functor3.setName("Keyword('a') | Keyword('b') | Keyword('c')")
functor4.setName("Combine(Keyword('a') | Keyword('b') | Keyword('c'))")
functors = [functor1, functor2, functor3, functor4]

tests = "a b c aaa bbb ccc after before".split()
for func in functors:
    print(func)
    for t in tests:
        try:
            print(t, ':', func.parseString(t))
        except ParseException as pe:
            # Show the failing input next to the error message
            print(t, ':', pe)
    print()

prints:

Literal('a') | Literal('b') | Literal('c')
a : ['a']
b : ['b']
c : ['c']
aaa : ['a']
bbb : ['b']
ccc : ['c']
after : ['a']
before : ['b']

Word('a') | Word('b') | Word('c')
a : ['a']
b : ['b']
c : ['c']
aaa : ['aaa']
bbb : ['bbb']
ccc : ['ccc']
after : ['a']
before : ['b']

Keyword('a') | Keyword('b') | Keyword('c')
a : ['a']
b : ['b']
c : ['c']
aaa : Expected "a" (at char 0), (line:1, col:1)
bbb : Expected "a" (at char 0), (line:1, col:1)
ccc : Expected "a" (at char 0), (line:1, col:1)
after : Expected "a" (at char 0), (line:1, col:1)
before : Expected "a" (at char 0), (line:1, col:1)

Combine(Keyword('a') | Keyword('b') | Keyword('c'))
a : ['a']
b : ['b']
c : ['c']
aaa : Expected "a" (at char 0), (line:1, col:1)
bbb : Expected "a" (at char 0), (line:1, col:1)
ccc : Expected "a" (at char 0), (line:1, col:1)
after : Expected "a" (at char 0), (line:1, col:1)
before : Expected "a" (at char 0), (line:1, col:1)

You should be able to make these observations:

Literal matches the given string even when it is only the start of a longer string.
Word matches a run of one or more characters drawn from the set of letters in its constructor string.
Keyword matches the given string only when it is not part of a larger word, that is, when it is followed by whitespace, a non-word character, or the end of the input.
Combine makes no difference in this example, because each alternative already yields a single token.
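
The difference between Literal and Keyword matters most for real keywords. Here is a minimal sketch, assuming a hypothetical keyword "if" and made-up inputs "ifx" and "if(x)" that are not part of the test code above:

```python
from pyparsing import Keyword, Literal, ParseException

# "if", "ifx" and "if(x)" are illustrative values only.
literal_if = Literal("if")
keyword_if = Keyword("if")

# Literal matches the leading "if" even though it is part of a longer word.
literal_result = literal_if.parseString("ifx").asList()

# Keyword refuses, because "if" is immediately followed by a word character.
try:
    keyword_if.parseString("ifx")
    keyword_matched = True
except ParseException:
    keyword_matched = False

# Keyword accepts "if" when a non-word character follows it.
keyword_result = keyword_if.parseString("if(x)").asList()

print(literal_result)   # ['if']
print(keyword_matched)  # False
print(keyword_result)   # ['if']
```

So for reserved words in a grammar, Keyword is usually the safer choice, since a Literal would happily match the first letters of an identifier.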

The purpose of Combine is to merge multiple matched tokens into a single string. For instance, if you defined a Social Security number as:

Word(nums,exact=3) + '-' + Word(nums,exact=2) + '-' + Word(nums,exact=4)

then parsing "555-66-7777" would give you

['555', '-', '66', '-', '7777']

Most likely you would like this as a single string, so combine the results by wrapping your parser expression in Combine:

Combine(Word(nums,exact=3) + '-' + Word(nums,exact=2) + '-' + Word(nums,exact=4))

Parsing the same input now gives

['555-66-7777']
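
The two Social Security number expressions can be run side by side as a small sketch. The helper name ssn_parts and the spaced input are made up for illustration. The last step also shows that Combine, by default, requires the matched pieces to be adjacent in the input, so whitespace between them makes the match fail:

```python
from pyparsing import Combine, ParseException, Word, nums

def ssn_parts():
    # Hypothetical helper: builds a fresh copy of the uncombined expression.
    return Word(nums, exact=3) + "-" + Word(nums, exact=2) + "-" + Word(nums, exact=4)

plain = ssn_parts()
combined = Combine(ssn_parts())

# Without Combine, every matched piece is a separate token.
plain_tokens = plain.parseString("555-66-7777").asList()

# With Combine, the pieces are joined into one string.
combined_tokens = combined.parseString("555-66-7777").asList()

# Combine does not skip whitespace between the pieces by default.
try:
    combined.parseString("555 - 66 - 7777")
    spaced_ok = True
except ParseException:
    spaced_ok = False

print(plain_tokens)     # ['555', '-', '66', '-', '7777']
print(combined_tokens)  # ['555-66-7777']
print(spaced_ok)        # False
```

The uncombined expression still accepts the spaced input, because pyparsing skips whitespace between tokens unless told otherwise.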