The most common architecture for this type of parser is to run the lexer inside your parser. Every time the parser needs a token, it calls a lexer function that retrieves the next one.
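I don't know ANTLR, but I think they all use the same approach. What I'm proposing is how yacc and lex work together: the parser generated by yacc calls the lexer's `yylex()` function each time it needs the next token.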
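
To make the idea concrete, here is a minimal sketch in Python of a parser that pulls tokens from the lexer on demand. It is not yacc/lex or ANTLR output; the `Lexer` and `Parser` classes, the `next_token` method, and the tiny grammar (integers joined by `+` and `-`) are all made up for illustration.

```python
DIGITS = "0123456789"


class Lexer:
    """Hypothetical lexer: hands out one token per call, never the whole list."""

    def __init__(self, text):
        self.text = text
        self.pos = 0

    def next_token(self):
        """Return the next (kind, value) pair, or ("EOF", None) at end of input."""
        # Skip whitespace between tokens.
        while self.pos < len(self.text) and self.text[self.pos].isspace():
            self.pos += 1
        if self.pos >= len(self.text):
            return ("EOF", None)
        ch = self.text[self.pos]
        if ch in DIGITS:
            start = self.pos
            while self.pos < len(self.text) and self.text[self.pos] in DIGITS:
                self.pos += 1
            return ("NUM", int(self.text[start:self.pos]))
        if ch in "+-":
            self.pos += 1
            return ("OP", ch)
        raise SyntaxError(f"unexpected character {ch!r} at position {self.pos}")


class Parser:
    """Hypothetical parser for: expr := NUM (('+' | '-') NUM)*"""

    def __init__(self, lexer):
        self.lexer = lexer
        # Keep one token of lookahead, fetched from the lexer.
        self.current = lexer.next_token()

    def advance(self):
        # The parser drives the lexer: ask for a token only when one is needed.
        self.current = self.lexer.next_token()

    def parse_number(self):
        kind, value = self.current
        if kind != "NUM":
            raise SyntaxError(f"expected a number, got {self.current}")
        self.advance()
        return value

    def parse_expr(self):
        value = self.parse_number()
        while self.current[0] == "OP":
            op = self.current[1]
            self.advance()
            rhs = self.parse_number()
            value = value + rhs if op == "+" else value - rhs
        if self.current[0] != "EOF":
            raise SyntaxError(f"unexpected token {self.current}")
        return value
```

For example, `Parser(Lexer("1 + 2 - 3")).parse_expr()` evaluates to `0`. The point of the design is that the lexer never tokenizes the whole input up front; the parser asks for one token at a time, which is the same calling relationship a yacc parser has with `yylex()`.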