Cannot parse this file correctly with pyparsing


You really are pretty close - congrats, indented parsers are not the easiest to write with pyparsing.

Look at the commented changes. Those marked with 'A' are changes to fix your two stated problems. Those marked with 'B' add Dict constructs so that you can access the parsed data as a nested structure using the names in the config.

The biggest culprit is that indentedBlock does some extra Group'ing for you, which gets in the way of Dict's name-value associations. Using ungroup to peel that away lets Dict see the underlying pairs.
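To see the effect in isolation, here is a minimal sketch. The toy "name: number" grammar is hypothetical, and the explicit outer Group only stands in for the extra layer that indentedBlock adds; it is not the parser from the question.

```python
import pyparsing

# Hypothetical toy grammar for illustration: "name: number" pairs.
key = pyparsing.Word(pyparsing.alphas)
value = pyparsing.Word(pyparsing.nums)
pair = pyparsing.Group(key + pyparsing.Suppress(":") + value)

# Stand-in for indentedBlock's extra layer: one more Group around all the pairs.
wrapped = pyparsing.Group(pyparsing.OneOrMore(pair))
nested = wrapped.parseString("a: 1 b: 2").asList()

# ungroup removes that outer layer, so Dict sees the [key, value] groups directly.
unwrapped = pyparsing.Dict(pyparsing.ungroup(wrapped))
result = unwrapped.parseString("a: 1 b: 2")
flat = result.asList()

print(nested)       # [[['a', '1'], ['b', '2']]]
print(flat)         # [['a', '1'], ['b', '2']]
print(result["a"])  # 1
```

Dict builds its names from the first token of each group it is handed, which is why the pairs have to sit at the top level of what it receives.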

Best of luck with pyparsing!

import pprint
import pyparsing
NEWLINE = pyparsing.LineEnd().suppress()
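# Raw string, so the backslashes reach srange intact and Python does not warn about invalid escapes
VALID_CHARACTERS = pyparsing.srange(r"[a-zA-Z0-9_\-\.]")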
COLON = pyparsing.Suppress(pyparsing.Literal(":"))
HYPHEN = pyparsing.Suppress(pyparsing.Literal("-"))
XX = pyparsing.Literal("XX")

list_item = HYPHEN + pyparsing.Combine(XX + pyparsing.Word(VALID_CHARACTERS))
list_of_items = pyparsing.Group(pyparsing.OneOrMore(list_item))

key = pyparsing.Word(VALID_CHARACTERS) + COLON
pair_value = pyparsing.Word(VALID_CHARACTERS) + NEWLINE
value = (pair_value | list_of_items)

#~ A: pair = pyparsing.Group(key + value)
pair = (key + value)

indentStack = [1]

section = pyparsing.Forward()
section_name = pyparsing.Word(VALID_CHARACTERS) + COLON
#~ A: section_value = pyparsing.OneOrMore(pair | section)
section_value = (pair | section)

#~ B: section_content = pyparsing.indentedBlock(section_value, indentStack, True)
section_content = pyparsing.Dict(pyparsing.ungroup(pyparsing.indentedBlock(section_value, indentStack, True)))

#~ A: section << Group(section_name + section_content)
section << (section_name + section_content)

#~ B: parser = pyparsing.OneOrMore(section)
parser = pyparsing.Dict(pyparsing.OneOrMore(pyparsing.Group(section)))

Now instead of pprint(result.asList()) you can write:

print(result.dump())

to show the Dict hierarchy:

[['sectionOne', ['list', ['XXitem', 'XXanotherItem']], ... etc. ...
- sectionOne: [['list', ['XXitem', 'XXanotherItem']], ... etc. ...
  - key1: value1
  - list: ['XXitem', 'XXanotherItem']
  - mods: ['XXone', 'XXtwo']
  - product: milk
  - release: now
  - subSection: [['skey', 'sval'], ['slist', ['XXitem']]]
    - skey: sval
    - slist: ['XXitem']
  - version: last
- sectionTwo: [['base', 'base-0.1'], ['config', 'config-7.0-7']]
  - base: base-0.1
  - config: config-7.0-7

allowing you to write statements like:

print(result.sectionTwo.base)
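If you want to try the access styles without the indentation machinery, the following sketch may help. It uses a hypothetical brace-delimited grammar (not your config format) with the same Dict-around-Group layering, and the sample input string is made up to mirror the sectionTwo values above.

```python
import pyparsing

# Hypothetical brace-delimited grammar, used only to keep the example short.
word = pyparsing.Word(pyparsing.alphanums + "_-.")
pair = pyparsing.Group(word + pyparsing.Suppress("=") + word)
body = pyparsing.Dict(pyparsing.OneOrMore(pair))
section = pyparsing.Group(
    word + pyparsing.Suppress("{") + body + pyparsing.Suppress("}")
)
parser = pyparsing.Dict(pyparsing.OneOrMore(section))

result = parser.parseString("sectionTwo { base = base-0.1 config = config-7.0-7 }")

print(result.sectionTwo.base)          # base-0.1
print(result["sectionTwo"]["config"])  # config-7.0-7
print(result.asDict())                 # {'sectionTwo': {'base': 'base-0.1', 'config': 'config-7.0-7'}}
print(repr(result.noSuchSection))      # ''
```

One thing to watch for: attribute access on a name that was never parsed returns an empty string instead of raising, so use the result["name"] form when a missing key should be an error.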