Question

I'm developing a css parser but cannot figure out how to parse nested function calls like alpha(rgb(1, 2, 3), 0.5).

Here is my code:

# -*- coding: utf-8 -*-


from pyparsing import * #, Word, alphas, OneOrMore, countedArray, And, srange, hexnums,  Combine, cStyleComment

from string import punctuation
from app.utils.file_manager import loadCSSFile, writeFile
from pyparsing import OneOrMore, Optional, Word, quotedString, delimitedList,  Suppress
from __builtin__ import len

# divide tudo por espaços, tabulações e novas linhas
words = ZeroOrMore(cStyleComment | Word(printables))

digit = '0123456789'; underscore = '_'; hyphen =  '-'
hexDigit = 'abcdefABCDEF' + digit
letter = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
az_AZ_underscore = underscore + letter;  
az_AZ_09_underscore = az_AZ_underscore + digit; 
az_AZ_hyphen = hyphen + letter;
az_AZ_09_hiphen = az_AZ_hyphen + digit;
LPAR = Suppress('('); RPAR = Suppress(')')

# identifiers
identifier = Combine(Word(az_AZ_underscore) + Optional(Word(az_AZ_09_underscore)))
# identifiers
identifier_reserved = Combine(Word(az_AZ_hyphen) + Optional(Word(az_AZ_09_hiphen)))

# numbers
hexadecimal = Word(hexnums, min=1)
integer = Word(digit, min=1)
decimal = Combine('.' + integer | integer + Optional('.' + integer))


# value values
color_hex = Combine('#' +  hexadecimal ) 
at_identifier =  Combine('@' + identifier)
arg = at_identifier | color_hex | decimal | quotedString 




function_call = identifier + LPAR + Optional(delimitedList(arg)) + RPAR

value =  Group(color_hex | at_identifier | function_call)

print(value.parseString('a(b())'))

I whould like to do something like arg = at_identifier | color_hex | decimal | quotedString | function_call but it's not possible because the variable is function_call is not declared yet.

How could I parse nested function calls using Pyparsing?

Was it helpful?

Solution

You are really very close. To define a recursive grammar like this, you need to forward-declare the nested expression.

As you have already seen, this:

arg = at_identifier | color_hex | decimal | quotedString
function_call = identifier + LPAR + Optional(delimitedList(arg)) + RPAR

only parses function calls with args that are not themselves function calls.

To define this recursively, first define an empty placeholder expression, using Forward():

function_call = Forward()

We don't know yet what will go into this expression, but we do know that it will be a valid argument type. Since it has been declared now we can use it:

arg = at_identifier | color_hex | decimal | quotedString | function_call

Now that we have arg defined, we can define what will go into a function_call. Instead of using ordinary Python assignment using '=', we have to use an operator that will modify the function_call instead of redefining it. Pyparsing allows you to use <<= or << operators (the first is preferred):

function_call <<= identifier + LPAR + Optional(delimitedList(arg)) + RPAR

This is now sufficient to parse your given sample string, but you've lost visibility to the actual nesting structure. If you group your function call like this:

function_call <<= Group(identifier + LPAR + Group(Optional(delimitedList(arg))) + RPAR)

you will always get a predictable structure from a function call (a named and the parsed args), and the function call itself will be grouped.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top