Question

I'm trying to use Ragel to write a simple lexer and generate valid Java code from it, but the generated code does not compile.

Here's the Lexer.rl that I'm using:

public class Lexer {
    %%{
      machine simple_lexer;

      integer     = ('+'|'-')?[0-9]+;
      float       = ('+'|'-')?[0-9]+'.'[0-9]+;
      assignment  = '=';
      identifier  = [a-zA-Z][a-zA-Z_]+; 

      main := |*
        integer => { emit("integer"); };
        float => { emit("float"); };
        assignment => { emit("assignment"); };
        identifier => { emit("identifier"); };
        space => { emit("space"); };
      *|;

    }%%

    %% write data;

    public static void emit(String token) {
        System.out.println(token);
    }

    public static void main(String[] args) {
        %% write init;

        %% write exec;
    }
}

The generated file and the error output are in: https://gist.github.com/3495276 (because it's too large to paste here =S )

So, what am I doing wrong?

Solution

You need to declare the variables that the generated code uses; Ragel does not declare them for you. Refer to section 5.1, "Variables used by Ragel", of the user guide.

main should look like this:

public static void main(String[] args) {
    int cs; /* state number */
    char[] data = "xy = 22 wq = 11.46".toCharArray(); /* input */
    int p = 0, /* start of input */
        pe = data.length, /* end of input */
        eof = pe,
        ts, /* token start */
        te, /* token end */
        act /* used for scanner backtracking */;

    %% write init;

    %% write exec;
}
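
To get a feel for which tokens this scanner should emit for that input, here is a rough plain-Java approximation of its longest-match behaviour. It is only a sketch: it uses java.util.regex rather than Ragel, the class name ScannerSketch and the method tokenize are made up for illustration, and it assumes that Java's `\s` is close enough to Ragel's `space` for this input. At each position it picks the rule with the longest match and prefers the earlier rule on a tie.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the Ragel scanner; not generated by Ragel.
public class ScannerSketch {
    // Rules in the same order as in Lexer.rl (LinkedHashMap keeps insertion order).
    static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("integer", Pattern.compile("[+-]?[0-9]+"));
        RULES.put("float", Pattern.compile("[+-]?[0-9]+\\.[0-9]+"));
        RULES.put("assignment", Pattern.compile("="));
        RULES.put("identifier", Pattern.compile("[a-zA-Z][a-zA-Z_]+"));
        RULES.put("space", Pattern.compile("\\s"));
    }

    // Returns the token names in order, choosing the longest match at each position.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String best = null;
            int bestEnd = pos;
            for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
                Matcher m = rule.getValue().matcher(input);
                m.region(pos, input.length());
                // lookingAt() anchors the match at pos; the strict '>' keeps
                // the earlier rule when two rules match the same length.
                if (m.lookingAt() && m.end() > bestEnd) {
                    best = rule.getKey();
                    bestEnd = m.end();
                }
            }
            if (best == null) {
                // Simplification: a real Ragel machine enters its error state here.
                throw new IllegalArgumentException("no rule matches at index " + pos);
            }
            tokens.add(best);
            pos = bestEnd;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("xy = 22 wq = 11.46"));
    }
}
```

For the input above, the sketch yields identifier, space, assignment, space, integer, space, identifier, space, assignment, space, float. Note that "11.46" comes out as a single float rather than an integer followed by something else, because the float rule matches the longer prefix.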

Also, I'm not sure you really want identifiers to be at least two characters long. With the + quantifier, a single letter such as x is not a valid identifier.

identifier  = [a-zA-Z][a-zA-Z_]+;

probably should be

identifier  = [a-zA-Z][a-zA-Z_]*;
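
The difference between the two quantifiers can be checked quickly outside Ragel. The snippet below is a minimal sketch that uses java.util.regex as a stand-in, on the assumption that these two patterns mean the same thing there as in Ragel; the class and method names are invented for the example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; compares the original and the suggested identifier rule.
public class IdentifierQuantifierDemo {
    // Original rule: a letter followed by ONE or more letters/underscores.
    static final Pattern PLUS = Pattern.compile("[a-zA-Z][a-zA-Z_]+");
    // Suggested rule: a letter followed by ZERO or more letters/underscores.
    static final Pattern STAR = Pattern.compile("[a-zA-Z][a-zA-Z_]*");

    static boolean matchesPlus(String s) {
        return PLUS.matcher(s).matches();
    }

    static boolean matchesStar(String s) {
        return STAR.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"x", "xy", "x_"}) {
            System.out.println(s + " -> plus: " + matchesPlus(s) + ", star: " + matchesStar(s));
        }
    }
}
```

Only the single-letter input "x" is treated differently: the + version rejects it and the * version accepts it. Neither version allows digits in identifiers, which may or may not be what you want.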
Licensed under: CC-BY-SA with attribution