How to select inner matches in lex
03-07-2019
Question
I am new to lex and I want to capture every individual match of a regular expression.
For example, in the following text:
/* text text
text
text
text */
text text
/* text text text text text text
text text */
I want to get the two separate matches between /* and */,
but lex matches the whole outer span and doesn't return the two! I use this expression:
\/\*(.|\n)*\*\/
How can I select the inner matches instead of the whole outer one? Thank you.
Solution
\/\*([^*]|\n|\*+[^*/])*\*+\/
What's going on is that * is greedy: lex takes the longest string it can match, so (.|\n)* runs past the first */ and stops only at the last one. The expression above treats the character * separately, ensuring that the body of the match can continue past a * only when that * is not followed by the character /. This is accomplished by having the interior units of the regular expression be one of
- a character that's not * ([^*])
- a newline (\n)
- a string of *s followed by a character that's neither * nor / (\*+[^*/])
At the end, there is a string of one or more *s followed by a /, which also covers a match that closes with **/. (Note: a previous version did not handle this case correctly. I really wish that flex had the *? operator.)
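To see the difference between the two expressions without building a scanner, here is a minimal sketch that runs both of them over the sample text from the question. It uses Python's re module rather than flex, which is an assumption made only for illustration: re is a backtracking engine, not lex's longest-match automaton, but on this input both patterns behave as described above. The names OUTER, INNER, SAMPLE and find_comments are made up for this example.

```python
import re

# Both patterns are copied verbatim from the question and the answer.
OUTER = re.compile(r"\/\*(.|\n)*\*\/")               # original expression
INNER = re.compile(r"\/\*([^*]|\n|\*+[^*/])*\*+\/")  # expression from the answer

# The sample text from the question.
SAMPLE = (
    "/* text text\n"
    "text\n"
    "text\n"
    "text */\n"
    "text text\n"
    "/* text text text text text text\n"
    "text text */\n"
)


def find_comments(pattern, text):
    """Return the full text of every non-overlapping match of pattern."""
    return [m.group(0) for m in pattern.finditer(text)]


if __name__ == "__main__":
    # One match running from the first /* to the last */.
    print(len(find_comments(OUTER, SAMPLE)))
    # Two matches, one per /* ... */ block.
    for comment in find_comments(INNER, SAMPLE):
        print(repr(comment))
```

In a real scanner the second pattern would go in the rules section of the lex file as it is. One caveat that applies to this Python sketch only: since [^*] already matches a newline, the extra \n alternative is redundant, and on a backtracking engine it can make a failed match (an unterminated /* followed by many lines) very slow. A lex-generated scanner does not backtrack this way.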