Question

I am having trouble with int and double tokens in PLY lex, using the following program. DOUBLE_VAL is returned for 1, whereas I expected INT_VAL. When I swap the order of the INT_VAL and DOUBLE_VAL functions, I get an error on the decimal point instead. How can I resolve this?

tokens = (
'VERSION',
'ID',
'INT_VAL',
'DOUBLE_VAL'
)

t_ignore = ' \t'
def t_VERSION(t):
    r'VERSION'
    return t

def t_DOUBLE_VAL(t):
    '[-+]?[0-9]+(\.[0-9]+)?([eE][-+]?[0-9]+)?'
    return t

def t_INT_VAL(t):
    r'[-+]?[0-9]+'
    return t

def t_ID(t):
    r'[a-zA-Z_]([_a-zA-Z0-9]*[a-zA-Z0-9])?'
    return t

def t_error(t):
    print("Error: ", t)
    #exit(-1)

import ply.lex as lex
lexer = lex.lex()
lexer.input('VERSION 1 4.0')
while True:
    tok = lexer.token()
    if not tok: break
    print(tok)

Solution

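PLY tries function-defined token rules in the order they are defined. Your t_DOUBLE_VAL comes first, and because both its fraction and its exponent are optional, it also matches plain integers such as 1. If you swap the order, t_INT_VAL consumes the 4 in 4.0 and no rule matches the decimal point that is left over, hence the error. Keep t_DOUBLE_VAL before t_INT_VAL, and change its expression so that it matches only when a decimal point or an exponent is present: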

def t_DOUBLE_VAL(t):
    r'[-+]?[0-9]+(\.([0-9]+)?([eE][-+]?[0-9]+)?|[eE][-+]?[0-9]+)'
    return t
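
To see the effect of rule order without running PLY, here is a minimal sketch that uses only the standard re module. The classify helper is hypothetical (it is not part of PLY); it simply tries the patterns in the given order and returns the first one that matches at the start of the text, which approximates how the function rules are tried in definition order.

```python
import re

# Patterns copied from the question and from the corrected rule.
INT_VAL = r'[-+]?[0-9]+'
DOUBLE_ORIGINAL = r'[-+]?[0-9]+(\.[0-9]+)?([eE][-+]?[0-9]+)?'
DOUBLE_FIXED = r'[-+]?[0-9]+(\.([0-9]+)?([eE][-+]?[0-9]+)?|[eE][-+]?[0-9]+)'


def classify(text, rules):
    """Hypothetical helper: return (rule name, matched prefix) for the first
    rule in `rules` that matches at the start of `text`, or (None, '')."""
    for name, pattern in rules:
        match = re.match(pattern, text)
        if match:
            return name, match.group(0)
    return None, ''


original_order = [('DOUBLE_VAL', DOUBLE_ORIGINAL), ('INT_VAL', INT_VAL)]
swapped_order = [('INT_VAL', INT_VAL), ('DOUBLE_VAL', DOUBLE_ORIGINAL)]
fixed_order = [('DOUBLE_VAL', DOUBLE_FIXED), ('INT_VAL', INT_VAL)]

# Original rules: the double pattern wins even for a plain integer.
print(classify('1', original_order))   # ('DOUBLE_VAL', '1')

# Swapped rules: only the '4' is consumed, so '.0' is left with no matching rule.
print(classify('4.0', swapped_order))  # ('INT_VAL', '4')

# Corrected rules: integers and doubles are told apart.
for sample in ('1', '4.0', '4.', '2e10'):
    print(sample, classify(sample, fixed_order))
```

With the corrected pattern, 1 falls through to INT_VAL, while 4.0, 4. and 2e10 are all matched in full as DOUBLE_VAL.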
Licensed under: CC-BY-SA with attribution