ANTLR on a noisy data stream, Part 2

https://stackoverflow.com/questions/4325011

29-09-2019

Question

Following a very interesting discussion with Bart Kiers about parsing a noisy data stream with ANTLR, I have run into another problem.

The goal is still the same: extract only the useful information, using the following grammar:

VERB            : 'SLEEPING' | 'WALKING';
SUBJECT         : 'CAT'|'DOG'|'BIRD'; 
INDIRECT_OBJECT : 'CAR'| 'SOFA';  
ANY             : . {skip();};

parse 
  :  sentenceParts+ EOF 
  ;

sentenceParts  
  :  SUBJECT VERB INDIRECT_OBJECT  
  ;

A sentence such as "it's 10PM and the Lazy CAT is currently SLEEPING heavily on the SOFA in front of the TV." produces the following:

[image: parse result]

This is perfect and does exactly what I want: from a long sentence, I extract only the words that mean something to me. But then I found the following error. If the text contains a word that begins exactly like one of the tokens, I end up with a MismatchedTokenException or a NoViableAltException.

    it's 10PM and the Lazy CAT is currently SLEEPING heavily, 
    with a DOGGY bag, on the SOFA in front of the TV.

produces an error:

[image: error output]

DOGGY is interpreted as the beginning of DOG, which is one of the alternatives of the SUBJECT token, and the lexer gets lost. How can I avoid this without defining DOGGY as a special token? I would like the parser to treat DOGGY as a word in its own right.

Solution

Well, it seems that adding ANY2 :'A'..'Z'+ {skip();}; solves my problem!
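
For context, here is a sketch of how the grammar from the question might look with that rule added. It assumes ANTLR 3 syntax, as in the question. The grammar name NoisyStream and the position of ANY2 (after the keyword rules, before ANY) are assumptions, not something the answer states. The idea is that an all-uppercase word such as DOGGY can now be matched whole by ANY2 and skipped, instead of producing a DOG token followed by skipped characters.

```antlr
grammar NoisyStream; // hypothetical name, not from the original post

parse
  :  sentenceParts+ EOF
  ;

sentenceParts
  :  SUBJECT VERB INDIRECT_OBJECT
  ;

// Keyword tokens, unchanged from the question.
VERB            : 'SLEEPING' | 'WALKING';
SUBJECT         : 'CAT' | 'DOG' | 'BIRD';
INDIRECT_OBJECT : 'CAR' | 'SOFA';

// Rule from the answer: consume any other run of uppercase letters as a
// single skipped token. Placing it after the keyword rules is an assumption,
// made so that an exact keyword such as DOG is still reported as SUBJECT.
ANY2            : 'A'..'Z'+ {skip();};

// Everything else is skipped one character at a time.
ANY             : . {skip();};
```

Note that ANY2 only covers runs of uppercase letters. A mixed-case word such as DOGgy would presumably still begin with a DOG token, so the character range may need widening for other kinds of input.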

Licensed under: CC-BY-SA with attribution
