Question

I apologize for the very basic question, but I'm really struggling here. I need to make a recursive descent parser. I'm working in Python and using PLY. My grammar is as follows:

<list> → ( <sequence> ) | ( )

<sequence> → <listelement> , <sequence> | <listelement>

<listelement> → <list> | NUMBER

Would that look something like this? Am I way off? The end goal is to read a list into a data structure and then print it out.

def p_list(p):
    '''list : "(" sequence ")"
            | "(" ")"'''

def p_sequence(p):
    '''sequence : list_el "," sequence
                | list_el'''

def p_list_el(p):
    '''list_el : list
               | NUMBER'''

If anyone is wondering what the full solution is, I'll post it shortly.


Solution

This is how I'd do it:

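import ply.lex as lex
import ply.yacc as yacc

tokens = ("LBRACKET", "RBRACKET",
          "INTEGER", "FLOAT", "COMMA") # So we can add other tokens
t_LBRACKET = r'\('
t_RBRACKET = r'\)'
t_INTEGER = r'\d+'
t_FLOAT = r'\d+\.\d+' # Longer regex, so PLY tries it before t_INTEGER
t_COMMA = r','
t_ignore = ' \t\n' # Skip whitespace between tokens

def t_error(t):
    raise SyntaxError("Illegal character %r" % t.value[0])

def p_list(p):
    """list : LBRACKET sequence RBRACKET
            | LBRACKET RBRACKET"""
    if len(p) == 4:
        p[0] = p[2]
    else:
        p[0] = [] # Empty list

def p_number(p):
    """number : INTEGER
              | FLOAT"""
    # Token values arrive as strings, so convert them here
    p[0] = float(p[1]) if "." in p[1] else int(p[1])

def p_sequence(p):
    """sequence : list_el COMMA sequence
                | list_el"""
    if len(p) == 4:
        p[0] = [p[1]] + p[3] # Prepend the element to the rest of the sequence
    else:
        p[0] = [p[1]]

def p_list_el(p):
    """list_el : number
               | list"""
    p[0] = p[1]

def p_error(p):
    raise SyntaxError("Syntax error at %r" % (p.value if p else "end of input"))

lexer = lex.lex()
parser = yacc.yacc()
print(parser.parse("(1, (2, 3.5), ())", lexer=lexer)) # [1, [2, 3.5], []]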

Edit:
Quick explanation of the extra tokens: everything in the input must eventually reduce to a token or literal character you have defined, so it is legal to add tokens such as FLOAT. Defining them all as named tokens makes the grammar easier to read and work with.
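
One more note, since the question asks for a recursive descent parser specifically: PLY's yacc module builds a table-driven LALR parser from the rules, so a PLY grammar is not recursive descent in the strict sense. If a hand-written recursive descent parser is what's required, a minimal sketch for the same grammar could look like the following. It uses only the standard library and reuses the token names from above. The names tokenize, Parser, and parse are made up for illustration and are not part of PLY.

```python
import re

# One alternative per token; FLOAT comes before INTEGER so "3.5" is not
# split at the decimal point. SKIP matches whitespace, which is dropped.
TOKEN_RE = re.compile(r"""
    (?P<FLOAT>\d+\.\d+)
  | (?P<INTEGER>\d+)
  | (?P<LBRACKET>\()
  | (?P<RBRACKET>\))
  | (?P<COMMA>,)
  | (?P<SKIP>\s+)
""", re.VERBOSE)


def tokenize(text):
    """Return a list of (kind, value) pairs, without whitespace."""
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"illegal character {text[pos]!r} at position {pos}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


class Parser:
    """One method per grammar rule; each method consumes its own tokens."""

    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        # Kind of the next token, or None at end of input.
        return self.tokens[self.pos][0] if self.pos < len(self.tokens) else None

    def expect(self, kind):
        # Consume the next token if it has the given kind, otherwise fail.
        if self.peek() != kind:
            raise SyntaxError(f"expected {kind}, found {self.peek()}")
        value = self.tokens[self.pos][1]
        self.pos += 1
        return value

    def parse_list(self):
        # list : LBRACKET sequence RBRACKET | LBRACKET RBRACKET
        self.expect("LBRACKET")
        items = [] if self.peek() == "RBRACKET" else self.parse_sequence()
        self.expect("RBRACKET")
        return items

    def parse_sequence(self):
        # sequence : list_el COMMA sequence | list_el
        items = [self.parse_list_el()]
        if self.peek() == "COMMA":
            self.expect("COMMA")
            items += self.parse_sequence()
        return items

    def parse_list_el(self):
        # list_el : list | number
        if self.peek() == "LBRACKET":
            return self.parse_list()
        if self.peek() == "FLOAT":
            return float(self.expect("FLOAT"))
        return int(self.expect("INTEGER"))


def parse(text):
    parser = Parser(text)
    result = parser.parse_list()
    if parser.peek() is not None:
        raise SyntaxError("unexpected input after the closing bracket")
    return result


print(parse("(1, (2, 3.5), ())"))  # [1, [2, 3.5], []]
```

Each grammar rule maps to one method, and a single token of lookahead (peek) is enough to choose between alternatives. The result is a nested Python list, so printing it covers the "read it into a data structure and print it out" part of the question.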

Licensed under: CC-BY-SA with attribution