(from the pyparsing wiki)
You can get the string parsed using infixNotation
(formerly known as operatorPrecedence
). Assuming that ',' has precedence over '|', this would look like:
variable = oneOf(list(alphas.lower()))
expr = infixNotation(variable,
[
(',', 2, opAssoc.LEFT),
('|', 2, opAssoc.LEFT),
])
Converting your test cases to a little testing framework, we can at least test the parsing part:
tests = [
("(a, b, c)", ["abc"]),
("(a, b | c)", ["ab", "c"]),
("((a, b) | c)", ["ab", "c"]),
("(a, (b | c))", ["ab", "ac"]),
("(a, b, (c | (d, e)), f)", ["abcf","abdef"]),
("(a, b, (c | (d, e) | f), g)", ["abcg", "abdeg", "abfg"]),
("(a, b, c, ((d, (e | f)) | (g, h)), i)",
["abcdei", "abcdfi", "abcghi"]),
("((a | b), c)", ["ac", "bc"]),
]
for test,expected in tests:
# if your expected values *must* be lists and not strings, then
# add this line
# expected = [list(ex) for ex in expected]
result = expr.parseString(test)
print result[0].asList()
which will give you something like this:
['a', ',', 'b', ',', 'c']
[['a', ',', 'b'], '|', 'c']
[['a', ',', 'b'], '|', 'c']
['a', ',', ['b', '|', 'c']]
['a', ',', 'b', ',', ['c', '|', ['d', ',', 'e']], ',', 'f']
['a', ',', 'b', ',', ['c', '|', ['d', ',', 'e'], '|', 'f'], ',', 'g']
['a', ',', 'b', ',', 'c', ',', [['d', ',', ['e', '|', 'f']], '|', ['g', ',', 'h']], ',', 'i']
[['a', '|', 'b'], ',', 'c']
That is the easy part, parsing the string and getting the operator precedence reflected in the resulting structure. Now if you follow the example from the regex inverter, you will need to attach objects to each parsed bit, something like this:
class ParsedItem(object):
def __init__(self, tokens):
self.tokens = tokens[0]
class Var(ParsedItem):
""" TBD """
class BinaryOpn(ParsedItem):
def __init__(self, tokens):
self.tokens = tokens[0][::2]
class Sequence(BinaryOpn):
""" TBD """
class Alternation(BinaryOpn):
""" TBD """
variable = oneOf(list(alphas.lower())).setParseAction(Var)
expr = infixNotation(variable,
[
(',', 2, opAssoc.LEFT, Sequence),
('|', 2, opAssoc.LEFT, Alternation),
])
Now you will have to implement the bodies of Var
, Sequence
, and Alternation
. You will not get a list of values directly back from pyparsing, instead you will get one of these object types back. Then, instead of calling asList()
as I did in the sample above, you'll call something like generate
or makeGenerator
to get a generator from that object. Then you'll invoke that generator to have the objects generate all the different results for you.
I leave the rest as an exercise for you.
-- Paul