How do I match a Regular Expression in a Happy parser?
29-04-2021
Question
I'm writing a JavaScript parser with Happy and I need to match a regular expression. I don't want to fully parse the regex, just store it as a string.
The relevant part of my AST looks like this:
data PrimaryExpr
    -- | Literal integer
    = ExpLitInt Integer
    -- | Literal strings
    | ExpLitStr String
    -- | Identifier
    | ExpId String
    -- | Bracketed expression
    | ExpBrackExp Expression
    -- | This (current object)
    | ExpThis
    -- | Regular Expression
    | ExpRegex String
    -- | Arrays
    | ExpArray ArrayLit
    -- | Objects
    | ExpObject [(PropName, Assignment)]
    deriving Show
This is the relevant Happy code:
primaryExpr :: { PrimaryExpr }
    : LITINT                { ExpLitInt $1 }
    | LITSTR                { ExpLitStr $1 }
    | ID                    { ExpId $1 }
    | THIS                  { ExpThis }
    | regex                 { ExpRegex $1 }
    | arrayLit              { ExpArray $1 }
    | objectLit             { ExpObject $1 }
    | '(' expression ')'    { ExpBrackExp $2 }
My question is: how should I define my regex non-terminal? Is this kind of structure right?
regex :: { String }
    : '/' whatHere? '/'     { $2 }
Solution
You should define the regex as a terminal (a token such as LITREGEX) that is recognized by the lexer, rather than as a non-terminal in the grammar.
primaryExpr :: { PrimaryExpr }
    : LITINT                { ExpLitInt $1 }
    | LITSTR                { ExpLitStr $1 }
    | LITREGEX              { ExpRegex $1 }
    | ID                    { ExpId $1 }
    | THIS                  { ExpThis }
    | arrayLit              { ExpArray $1 }
    | objectLit             { ExpObject $1 }
    | '(' expression ')'    { ExpBrackExp $2 }
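For the parser to see LITREGEX, the lexer's token type needs a constructor that carries the regex text, and the lexer has to decide when a `/` opens a regex rather than meaning division. The following is a minimal sketch, not the answerer's code: the `Token` type, its constructors and `slashStartsRegex` are hypothetical names, and the previous-token rule is a common approximation, not the full JavaScript rule (a `/` after `)` can still open a regex, as in `if (x) /re/.test(y)`). In the Happy file the terminal would then be declared in the `%token` section with something like `LITREGEX { TokRegex $$ }`.

```haskell
-- Hypothetical token type; only the constructors needed here are shown.
data Token
  = TokInt Integer
  | TokStr String
  | TokId String
  | TokThis
  | TokRegex String   -- body of the regex literal, stored unparsed
  | TokPunct String   -- operators and punctuation such as "(", ")", "="
  deriving (Eq, Show)

-- | Decide whether a '/' starts a regex literal, given the previous token.
-- After a token that can end an operand, '/' is read as division instead.
slashStartsRegex :: Maybe Token -> Bool
slashStartsRegex prev = case prev of
  Nothing           -> True    -- start of input
  Just (TokInt _)   -> False
  Just (TokStr _)   -> False
  Just (TokId _)    -> False
  Just TokThis      -> False
  Just (TokRegex _) -> False
  Just (TokPunct p) -> p `notElem` [")", "]"]  -- approximation, see above
```

With this in place, the lexer would emit `TokRegex` only when `slashStartsRegex` holds for the token it produced last, and fall back to a division token otherwise.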
Other tips
To answer the question in the comment, I need a bit more room.
Something like this (spaced out and commented):
/                   forward slash
(   \\.             either: an escaped character
  | [^\[/\\]        anything which isn't / or [ or \
  | \[              a character class containing:
      [^\]]*        anything which isn't ] any number of times
    \]
)*                  any number of times
/                   forward slash
Condensed:
/(\\.|[^\[/\\]|\[[^\]]*\])*/
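If the lexer is written by hand rather than generated, the same pattern can be expressed with `Text.ParserCombinators.ReadP` from `base`. This is only a sketch under that assumption: `regexUnit`, `regexLiteral` and `matchRegex` are illustrative names, and each helper mirrors one alternative of the pattern above.

```haskell
import Text.ParserCombinators.ReadP
  (ReadP, char, many, munch, readP_to_S, satisfy, (<++))

-- | One unit of the regex body: one of the three alternatives of the pattern.
regexUnit :: ReadP String
regexUnit = escaped <++ charClass <++ plain
  where
    -- \\.  a backslash followed by any character except a newline
    escaped = do
      _ <- char '\\'
      c <- satisfy (/= '\n')
      return ['\\', c]
    -- \[[^\]]*\]  a character class: '[', anything but ']', then ']'
    charClass = do
      _ <- char '['
      body <- munch (/= ']')
      _ <- char ']'
      return ("[" ++ body ++ "]")
    -- [^\[/\\]  any single character other than '/', '[' or '\'
    plain = do
      c <- satisfy (`notElem` "/[\\")
      return [c]

-- | A whole literal: '/', any number of units, '/'. Yields the body only.
regexLiteral :: ReadP String
regexLiteral = do
  _ <- char '/'
  units <- many regexUnit
  _ <- char '/'
  return (concat units)

-- | Try to read a regex literal at the start of the input.
-- On success, returns the body and the remaining input.
matchRegex :: String -> Maybe (String, String)
matchRegex input =
  case readP_to_S regexLiteral input of
    ((body, rest) : _) -> Just (body, rest)
    []                 -> Nothing
```

For example, `matchRegex "/a[/]b/.test(x)"` gives `Just ("a[/]b", ".test(x)")`, because the `/` inside the character class does not end the literal. Like the pattern itself, this accepts an empty body (`//`), which JavaScript reads as the start of a line comment, so a real lexer would need to try its comment rule first.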
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow