Question

I have the following lexer rule: ID : [a-z][a-z0-9_]*;

It works well, except that it does not match identifiers like 1a or 222z222. I want those to match, but not tokens made up entirely of digits, like 1 or 999.

So, what should I do to solve the problem?

Solution

Your lexer rule is [a-z][a-z0-9_]*. It matches a token that starts with a lowercase letter, followed by zero or more lowercase letters, digits, or underscores.

If you want identifiers to start with either a lowercase letter or a digit, but not to consist entirely of digits, then try:

ID : [a-z][a-z0-9_]* | [0-9]+[a-z_][a-z0-9_]* ;  // Updated

The rule has two alternatives, and the terminating semicolon appears only once, at the end of the rule:

  • [a-z][a-z0-9_]* : matches a token that starts with a lowercase letter.
  • [0-9]+[a-z_][a-z0-9_]* : if the token starts with digits, then after one or more digits it expects one lowercase letter or underscore, followed by zero or more lowercase letters, digits, or underscores.

You can write the same thing as ([a-z]|[0-9]+[a-z_])[a-z0-9_]*;.
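
As a quick sanity check outside ANTLR, the compact pattern can be translated into a Python regular expression and tried against the inputs from the question. This is only a sketch: the names ID_PATTERN and is_identifier are hypothetical, and it tests whole strings in isolation, so it does not model how an ANTLR lexer picks between competing rules.

```python
import re

# Python translation of the compact rule body (hypothetical name, not part of the grammar).
# (?:...) is a non-capturing group; the character classes are the same as in the lexer rule.
ID_PATTERN = re.compile(r"(?:[a-z]|[0-9]+[a-z_])[a-z0-9_]*")


def is_identifier(text: str) -> bool:
    """Return True if the entire string matches the identifier pattern."""
    return ID_PATTERN.fullmatch(text) is not None


if __name__ == "__main__":
    # Inputs taken from the question, plus one ordinary identifier.
    for sample in ["abc", "1a", "222z222", "1", "999"]:
        print(sample, is_identifier(sample))
```

With this pattern, abc, 1a and 222z222 are accepted, while 1 and 999 are rejected because the digit-led alternative requires at least one letter or underscore after the digits.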

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow