Question

I have this basic JFlex lexer:

import java.util.*;
%%

%public
%class TuringLexer
%type Void

%init{
yybegin(YYINITIAL);
%init}

%state COMM, GETALPH, MT, PARSELOOP, PARSELEMS, PARSESYMB, PARSEMT
%{
  ArrayList<Character> alf = new ArrayList<Character>();   
  String crtMach;
  String crtLoop;
  String crtLoopContent;
  String crtLoopContentParam;
  String crtContent;
  String crtSymb;
%}

//Input = [^\r\n]
SEP = [:space:]*
//COMM =[;.*$] 
name = [A-Za-z_]*
tok=[A-Za-z0-9#$@\*]
AL = "alphabet :: "
cont = [^]]*
param =[^)]*
letter = [A-Za-z]
opn = [\[?]
symb = [^\}]+
%%
 <COMM> {
  "."  { /* ignore */  System.out.println("Got into comm state ");}
  "\n" {System.out.println("Got out of comm state ");yybegin(YYINITIAL);}
}
 <GETALPH> {
 {SEP} { /* ignore */ }
 {tok} { String str = yytext();
     System.out.println("Alphabet -- " + str);
     Character c = str.charAt(0);
     alf.add(c); }
 ";"  {yybegin(YYINITIAL);}

}
 <YYINITIAL> {
 "\n"   { /* ignore */ System.out.println("Got into YYINITIAL"); }
 ";"  { yybegin(COMM); }

[^]                    { throw new Error("Illegal character <"+yytext()+">"); }
}

Some code has been removed for clarity. The issue still occurs with this reduced version, so it should be easier to identify here.

The input file is called simple.mt.

And this is the main class:

import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.Reader;
import java.io.BufferedReader;
import java.io.FileReader;
public class MainClass {
    public static void main(String args[]) throws IOException {
        Reader reader = new BufferedReader(new FileReader("simple.mt"));
        reader.read();
        TuringLexer tl = new TuringLexer(reader);
        tl.yylex();
    }
}

When I run the project in Eclipse (or from the terminal, for that matter), I get:

Exception in thread "main" java.lang.Error: Illegal character <l>
    at TuringLexer.yylex(TuringLexer.java:576)
    at MainClass.main(MainClass.java:11)

I have no idea what the error means or how to debug it. What remains of the JFlex file is a small sample, so the error shouldn't be that hard to figure out.


Solution

So you have a character appearing in your input that you don't know how to handle.

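All lex files should have a final catch-all rule (written . in lex; [^] in JFlex, as you already have) that either prints an 'illegal character' error message (rather than throwing an exception), or else just returns yytext[0] (yytext().charAt(0) in JFlex) to the parser for the parser to deal with.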

The latter strategy also saves you from having to write a rule for each special character, for example =, + and so on: the parser should just use them as '=', '+', etc. Then (a) any illegal character just becomes a syntax error, but more importantly (b) the parser gets to use its error recovery, rather than just throwing the token away.
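The two fallback strategies can be sketched in plain Java. This is only an illustration under stated assumptions: the class FallbackRuleSketch, its scan method, and the NAME(...) token format are hypothetical and hand-written, not something JFlex generates. It shows the catch-all branch either recording an error and continuing, or passing the character through as its own token.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-written scanner, used only to illustrate the two
// fallback strategies; it is not generated by JFlex.
class FallbackRuleSketch {

    // Scans letters into NAME(...) tokens and skips whitespace.
    // Any other character reaches the final catch-all branch, which
    // either passes it through as a token or records an error message.
    static List<String> scan(String input, boolean passThrough, List<String> errors) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // ignore whitespace
            } else if (Character.isLetter(c)) {
                int start = i;
                while (i < input.length() && Character.isLetter(input.charAt(i))) {
                    i++;
                }
                tokens.add("NAME(" + input.substring(start, i) + ")");
            } else if (passThrough) {
                // Strategy 2: hand the raw character to the parser as '=' and so on.
                tokens.add("'" + c + "'");
                i++;
            } else {
                // Strategy 1: report the character and keep scanning, no exception.
                errors.add("Illegal character <" + c + "> at offset " + i);
                i++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        List<String> errors = new ArrayList<>();
        System.out.println(scan("ab = c", false, errors));
        System.out.println(errors);
        System.out.println(scan("ab = c", true, new ArrayList<>()));
    }
}
```

With the pass-through variant, the scanner never needs a dedicated rule per punctuation character, and an unexpected character surfaces later as an ordinary syntax error in the parser.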

OTHER TIPS

You either do not show all the grammar or the grammar is incomplete.

Exception in thread "main" java.lang.Error: Illegal character <l>

This message tells you that your lexer does not handle the loop keywords.
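A separate thing that may be worth checking, since the input file is not shown: MainClass calls reader.read() before handing the reader to TuringLexer, and Reader.read() consumes one character. If simple.mt happened to begin with a word such as alphabet (purely an assumption), the lexer would never see the a, and the first character reaching the catch-all rule would be l. The sketch below only demonstrates that effect with a StringReader; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical demo class: shows what is left on a Reader after one read() call.
class ReaderReadSketch {

    // Mimics MainClass: calls read() once, then returns everything a
    // later consumer (such as a lexer) would still receive.
    static String afterSkippingOne(String text) {
        try (Reader reader = new StringReader(text)) {
            reader.read(); // consumes the first character
            StringBuilder rest = new StringBuilder();
            int c;
            while ((c = reader.read()) != -1) {
                rest.append((char) c);
            }
            return rest.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed file contents, for illustration only.
        System.out.println(afterSkippingOne("alphabet :: a b ;"));
        // prints: lphabet :: a b ;
    }
}
```

If this turns out to apply, removing the reader.read() line from MainClass would let the lexer see the whole file. The YYINITIAL block shown in the question would still need rules for anything other than "\n" and ";".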

Licensed under: CC-BY-SA with attribution