I'm assuming that you are using JavaCC. The answer depends on the syntax of strings in your language. Let's say any character is allowed in a string other than an apostrophe. I.e. a string consists of two apostrophes and any number (0 or more) of nonapostrophes in between.
<STRING: "'" (~["'"])* "'">
Now many languages don't allow newlines or returns in strings. So here let's ban those too:
<STRING: "'" (~["'","\n","\r"])* "'">
Now the problem is: what if someone wants to put apostrophes, newlines or returns in a string? Some languages (e.g. C) use backslashes as an escape, so let's say
- \' means apostrophe
- \n means newline
- \r means return
- \\ means backslash
- \x where x is any other character is considered an error
Here is the expression
<STRING: "'" ("\\" ("\\" | "n" | "r" | "'") | ~["\\","\n","\r","'"] )* "'">
I.e. a string is two apostrophes with a sequence of 0 or more groups in between, where each group is either one of the two-character sequences \\, \n, \r, \', or a single character that is not a backslash, a newline, a return or an apostrophe.
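If you want to sanity-check which inputs that expression accepts before running JavaCC, here is a rough equivalent using java.util.regex. This is only a sketch: the class name StringTokenCheck and the method isStringToken are made up for illustration, and the regex is my translation of the JavaCC expression, not something JavaCC produces.

```java
import java.util.regex.Pattern;

class StringTokenCheck {
    // Hypothetical translation of the JavaCC expression into java.util.regex:
    // an apostrophe, then zero or more groups, then an apostrophe. Each group is
    // a backslash followed by one of \ n r ', or a single character that is not
    // a backslash, newline, return or apostrophe.
    private static final Pattern STRING =
        Pattern.compile("'(?:\\\\[\\\\nr']|[^\\\\\\n\\r'])*'");

    // True only if the whole text is one well-formed string literal.
    static boolean isStringToken(String text) {
        return STRING.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(isStringToken("'it\\'s'")); // true: \' is an allowed escape
        System.out.println(isStringToken("'bad\\x'")); // false: \x is not an allowed escape
    }
}
```

Note that the Java string literal doubles every backslash again, so it looks noisier than the JavaCC version.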
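Another approach is to use lexical states. Text matched by MORE rules is accumulated and becomes the prefix of the next token's image, so the token that ends the string carries the whole string, apostrophes included.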
<DEFAULT> MORE: { "'" : INSTRING }
<INSTRING> MORE: { "\\\\"
| "\\n"
| "\\r"
| "\\'"
| ~["\\","\n","\r","'"]
}
<INSTRING> TOKEN: { <STRING: "'"> : DEFAULT }
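To make the state switching concrete, here is a small hand-written Java simulation of the two states. It is a sketch of the idea only, not the code JavaCC generates; the names LexicalStateSketch and scanString are hypothetical, and it assumes the string starts at the first character of the input.

```java
class LexicalStateSketch {
    enum State { DEFAULT, INSTRING }

    // Returns the image of the string token at the start of input,
    // or null where the real lexer would report a lexical error.
    static String scanString(String input) {
        State state = State.DEFAULT;
        StringBuilder image = new StringBuilder(); // text accumulated by the MORE rules
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (state == State.DEFAULT) {
                if (c != '\'') return null;   // this sketch only handles a leading string
                image.append(c);
                state = State.INSTRING;       // "'" : INSTRING
                i++;
            } else if (c == '\'') {
                image.append(c);              // closing apostrophe ends the token
                return image.toString();      // and the lexer goes back to DEFAULT
            } else if (c == '\\') {
                // A backslash must be followed by one of \ n r '
                if (i + 1 >= input.length()) return null;
                char next = input.charAt(i + 1);
                if ("\\nr'".indexOf(next) < 0) return null;
                image.append(c).append(next);
                i += 2;
            } else if (c == '\n' || c == '\r') {
                return null;                  // raw newline or return is not allowed
            } else {
                image.append(c);
                i++;
            }
        }
        return null; // input ended before the closing apostrophe
    }
}
```

For example, scanString applied to 'a\nb' followed by other text returns just the 'a\nb' part, with the backslash and the n kept as two characters.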