JavaCC - How to store tokens for later parse?

https://stackoverflow.com/questions/17229227

01-06-2022

Question

I'm trying to parse my input partially, so that I can store certain chunks for a later parse.

void start():{}
{
    stmt()*
}

void stmt():{}
{
    "parse:" expr_later() ";"
}

void expr_later():{}
{
    (
    expr()
    // store tokens from expr() in a list for later processing....
    )*
}

void expr():{}
{
    "{" expr() "}"
|    <ANY:~[]>
}

In this case, the "ANY" token is only produced when no other token definition matches first, so once I have many more token definitions, the grammar above won't work.

I know that ~[] matches any character and not any token.

Further, let's say I would use token states instead (stuff they do with javadoc, pragmas etc.), I would still have a problem capturing the chunks, since I don't have any token to set my special token state. Also, setting the token state via the parser seems to be a bad practice according to JavaCC's FAQ, since the TokenManager might already have some tokens in its queue.

So I'm wondering if there's any ANY-equivalent for tokens. Or does someone at least have an idea how to approach my problem in a different way?

Solution

Of course one way to do it is to make a big production that lists every kind of token except "{" and "}".

Token any() : {Token t;} { ( t=<NUMBER> | t=<IDENTIFIER> | t="(" | ... ) {return t;} }

But that's not at all elegant.

Instead, you can write a JAVACODE production that consumes tokens until the final close-brace is found. See https://javacc.java.net/doc/javaccgrm.html#JAVACODE for a similar example.
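
As a rough illustration of that idea, here is a plain-Java sketch of the loop such a JAVACODE production could run: peek at the next token, track brace nesting, and store tokens until the first ";" that is not inside braces. Everything named here is an assumption for the sake of the example: Tok and TokStream are hypothetical stand-ins for the generated Token class and for the parser's getToken(1) and getNextToken() methods, and the kind constants stand in for named tokens (say <LBRACE>, <RBRACE>, <SEMICOLON>) that the grammar would have to define.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java stand-in for the body of a JAVACODE production.
// Tok and TokStream are hypothetical substitutes for JavaCC's Token class
// and for the generated parser's getToken(1) / getNextToken() methods.
class DeferredTokens {
    // Assumed token kinds; a real grammar would supply its own constants.
    static final int EOF = 0, LBRACE = 1, RBRACE = 2, SEMICOLON = 3, OTHER = 4;

    static final class Tok {
        final int kind;
        final String image;
        Tok(int kind, String image) { this.kind = kind; this.image = image; }
    }

    static final class TokStream {
        private final List<Tok> toks;
        private int pos = 0;
        TokStream(List<Tok> toks) { this.toks = toks; }
        // Look at the next token without consuming it (like getToken(1)).
        Tok peek() { return pos < toks.size() ? toks.get(pos) : new Tok(EOF, ""); }
        // Consume and return the next token (like getNextToken()).
        Tok next() { Tok t = peek(); if (pos < toks.size()) pos++; return t; }
    }

    // Collects tokens up to, but not including, the first ";" that is not
    // nested inside braces. That ";" is left for the calling production.
    static List<Tok> collectDeferred(TokStream in) {
        List<Tok> stored = new ArrayList<>();
        int nesting = 0;
        while (true) {
            Tok t = in.peek();
            if (t.kind == EOF) throw new IllegalStateException("unterminated chunk");
            if (t.kind == SEMICOLON && nesting == 0) break;
            if (t.kind == LBRACE) nesting++;
            if (t.kind == RBRACE && --nesting < 0) {
                throw new IllegalStateException("unbalanced '}'");
            }
            stored.add(in.next());
        }
        return stored;
    }

    // Demo-only lexer: classifies whitespace-separated words as tokens.
    static TokStream lex(String src) {
        List<Tok> toks = new ArrayList<>();
        for (String w : src.trim().split("\\s+")) {
            int kind = w.equals("{") ? LBRACE
                     : w.equals("}") ? RBRACE
                     : w.equals(";") ? SEMICOLON
                     : OTHER;
            toks.add(new Tok(kind, w));
        }
        return new TokStream(toks);
    }

    // Joins the token images with single spaces, for display.
    static String images(List<Tok> toks) {
        StringBuilder sb = new StringBuilder();
        for (Tok t : toks) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(t.image);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        TokStream in = lex("a { b ; } c ; d");
        System.out.println(images(collectDeferred(in))); // prints: a { b ; } c
    }
}
```

In an actual grammar, the body of collectDeferred would sit under the JAVACODE keyword as the body of expr_later(), with peek() replaced by getToken(1) and next() by getNextToken(). The stored list can then be handed to a second parser or re-scanned later. Keep in mind that JavaCC treats a JAVACODE production as a black box, so it is safest to call it where the parser does not have to choose between it and another alternative.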

Licensed under: CC-BY-SA with attribution
