Question

This is a snippet of a DSL that I am trying to parse with pyparsing.

I have a string of the format <keyword> 02 01 30 03 40 20 10, where:
02 is the number of strings
01 is the length of string1 (in bytes)
30 is string1 itself
03 is the length of string2 (in bytes)
40 20 10 is string2

How do I tokenize this string using pyparsing?


Solution

So it's a countedArray of countedArrays? Did you try:

from pyparsing import Word, nums, alphas, countedArray

test = "key 02 01 30 03 40 20 10"

integer = Word(nums)

# each string is a countedArray of integers, and the data is a countedArray
# of those, so...
lineExpr = Word(alphas)("keyword") + countedArray(countedArray(integer))("data")

# parse the test string, showing the keyword and the list of lists for the data
print(lineExpr.parseString(test).asList())

Gives:

['key', [['30'], ['40', '20', '10']]]
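If the goal is the strings themselves rather than lists of tokens, the nested lists can be joined afterwards. This is only a sketch: it assumes each two-character token is one byte written in hexadecimal, which the question does not state, and it needs Python 3 for bytes.fromhex. The names parsed_data and strings are illustrative.

```python
# The data portion of the parsed output shown above, copied by hand here.
parsed_data = [['30'], ['40', '20', '10']]

# Assumption: every token is a single byte in hexadecimal notation.
# Join the tokens of each string and convert the hex digits to bytes.
strings = [bytes.fromhex("".join(tokens)) for tokens in parsed_data]

print(strings)
```

If the tokens are decimal instead, bytes(int(tok) for tok in tokens) would be the equivalent conversion.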

The named results also let you get at the parsed bits by name:

result = lineExpr.parseString(test)
print(result.keyword)
print(result.data)

Gives:

key
[[['30'], ['40', '20', '10']]]
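As a cross-check of what the nested countedArray is doing, the same length-prefixed layout can be walked by hand in plain Python. This is a minimal sketch, not part of pyparsing: decode_counted_strings is a hypothetical helper, and it assumes whitespace-separated fields with decimal counts (countedArray's default count expression is also a decimal integer).

```python
def decode_counted_strings(line):
    """Split '<keyword> N len1 item... len2 item...' into (keyword, groups)."""
    fields = line.split()
    keyword, fields = fields[0], fields[1:]

    pos = 0
    count = int(fields[pos])          # outer count: number of strings
    pos += 1

    groups = []
    for _ in range(count):
        length = int(fields[pos])     # inner count: number of items in this string
        pos += 1
        groups.append(fields[pos:pos + length])
        pos += length
    return keyword, groups


keyword, groups = decode_counted_strings("key 02 01 30 03 40 20 10")
print(keyword, groups)
```

For the test string this yields the keyword 'key' and the groups [['30'], ['40', '20', '10']], matching the pyparsing result.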
Licensed under: CC-BY-SA with attribution