Question

I am trying to get the simplest possible parser to work with JParsec 2.0.1, but am having no luck. I have the following AST classes:

public abstract class Node {
}

public final class ConstantNode extends Node {
    private final String value;

    public ConstantNode(String value) {
        this.value = value;
    }

    @Override
    public String toString() {
        return this.value;
    }
}

And the following test code:

import junit.framework.Assert;

import org.codehaus.jparsec.Parser;
import org.codehaus.jparsec.Parsers;
import org.codehaus.jparsec.Scanners;
import org.codehaus.jparsec.Terminals;
import org.codehaus.jparsec.Token;
import org.codehaus.jparsec.functors.Map;
import org.junit.Test;

import ast.ConstantNode;
import ast.Node;

public class ParserTest {
    private static final Parser<Token> CONSTANT_LEXER = Parsers
        .or(Terminals.StringLiteral.SINGLE_QUOTE_TOKENIZER,
            Terminals.StringLiteral.DOUBLE_QUOTE_TOKENIZER)
        .token();

    private static final Parser<Node> CONSTANT_PARSER = CONSTANT_LEXER.map(new Map<Token, Node>() {
        @Override
        public Node map(Token from) {
            return new ConstantNode(from.toString());
        }
    });

    private static final Parser<Void> IGNORED = Scanners.WHITESPACES;

    @Test
    public void testParser() {
        Object result = null;

        // this passes
        result = CONSTANT_LEXER.parse("'test'");
        Assert.assertEquals("test org.codehaus.jparsec.Token", result + " " + result.getClass().getName());

        // this fails with exception: org.codehaus.jparsec.error.ParserException: Cannot scan characters on tokens.
        result = CONSTANT_PARSER.from(CONSTANT_LEXER, IGNORED).parse("'test'");
        Assert.assertEquals("test ast.ConstantNode", result + " " + result.getClass().getName());
    }
}

Even though my lexer is successfully parsing the string input to tokens, my parser is unable to consume those tokens due to the JParsec exception. I've studied this code over and over and can only assume that either this is a jparsec bug, or I'm misunderstanding something obvious.

Can anyone tell me what I'm doing wrong here?

UPDATE: I believe the original problem is due to recursive references. My CONSTANT_PARSER is using the CONSTANT_LEXER, and then later I call CONSTANT_PARSER.from(CONSTANT_LEXER...). By changing my CONSTANT_PARSER to the following, my test then passed:

private static final Parser<Node> CONSTANT_PARSER = Parsers.tokenType(Token.class, "constant").map(new Map<Token, Node>() {
    @Override
    public Node map(Token from) {
        return new ConstantNode(from.toString());
    }
});

However, this still hasn't completely clicked for me. I suspect there's a better way to be doing this, so am still very interested in any ideas.

No correct solution

OTHER TIPS

You are mixing two different kinds of "parsers": string parsers (called Scanners in JParsec) and token parsers. This call:

CONSTANT_PARSER.from(CONSTANT_LEXER, IGNORED).parse("'test'");

essentially says that CONSTANT_PARSER should get its input as a stream of tokens from CONSTANT_LEXER, with IGNORED as the separator. The trouble is that CONSTANT_PARSER is defined by "mapping" CONSTANT_LEXER, which means it parses its input using the lexer and then maps the result. In other words, it expects characters but is handed tokens, which raises the following error:

org.codehaus.jparsec.error.ParserException: Cannot scan characters on tokens.
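
To make the two levels concrete, here is a toy model in plain Java. It is not JParsec code and not how JParsec is implemented; the names TwoPhaseSketch, Tok, lex and parseConstant are made up for illustration, and escape sequences inside strings are ignored. It only shows that the first phase reads characters while the second phase sees nothing but tokens.

```java
import java.util.ArrayList;
import java.util.List;

public class TwoPhaseSketch {
    /** A token produced by the character-level phase (hypothetical type). */
    static final class Tok {
        final String value;

        Tok(String value) {
            this.value = value;
        }
    }

    /** Phase 1, character level: turns quoted strings into tokens and skips whitespace. */
    static List<Tok> lex(String source) {
        List<Tok> tokens = new ArrayList<>();
        int i = 0;
        while (i < source.length()) {
            char c = source.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
                continue;
            }
            if (c != '\'' && c != '"') {
                throw new IllegalArgumentException("unexpected character at " + i);
            }
            // Find the matching closing quote (no escape handling in this toy).
            int end = source.indexOf(c, i + 1);
            if (end < 0) {
                throw new IllegalArgumentException("unterminated string at " + i);
            }
            tokens.add(new Tok(source.substring(i + 1, end)));
            i = end + 1;
        }
        return tokens;
    }

    /** Phase 2, token level: works on tokens only and never looks at raw characters. */
    static String parseConstant(List<Tok> tokens) {
        if (tokens.size() != 1) {
            throw new IllegalArgumentException("expected exactly one constant");
        }
        return tokens.get(0).value;
    }
}
```

In this picture, your original CONSTANT_PARSER is a phase 1 routine that was plugged in where a phase 2 routine is expected, so it has no characters to scan.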

By defining CONSTANT_PARSER as Parsers.tokenType(Token.class, "constant"), you effectively say that the parser consumes a stream of tokens and names them "constant". However, I think this does not work the way you would expect, because it would match any type of Token, not only constants.
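
One way to restrict the token-level parser to string constants is sketched below. It assumes JParsec 2.0.x on the classpath and relies on Terminals.StringLiteral.PARSER, which as far as I know accepts only tokens whose value is a String. The class name ConstantParserSketch and the nested Node and ConstantNode stand-ins are only there so the sketch compiles on its own; in your project you would use your ast classes instead.

```java
import org.codehaus.jparsec.Parser;
import org.codehaus.jparsec.Parsers;
import org.codehaus.jparsec.Scanners;
import org.codehaus.jparsec.Terminals;
import org.codehaus.jparsec.functors.Map;

public class ConstantParserSketch {
    // Minimal stand-ins for the question's AST classes (illustration only).
    abstract static class Node {
    }

    static final class ConstantNode extends Node {
        private final String value;

        ConstantNode(String value) {
            this.value = value;
        }

        @Override
        public String toString() {
            return value;
        }
    }

    // Character level: scans quoted text and yields the unquoted String.
    // There is no .token() call here; as I understand it, from(...) wraps
    // each result in a Token on its own.
    static final Parser<String> TOKENIZER = Parsers.or(
        Terminals.StringLiteral.SINGLE_QUOTE_TOKENIZER,
        Terminals.StringLiteral.DOUBLE_QUOTE_TOKENIZER);

    // Token level: matches String-valued tokens and maps them to AST nodes.
    static final Parser<Node> CONSTANT =
        Terminals.StringLiteral.PARSER.map(new Map<String, Node>() {
            @Override
            public Node map(String from) {
                return new ConstantNode(from);
            }
        });

    // Wires the two levels together: TOKENIZER produces the tokens,
    // whitespace separates them, and CONSTANT consumes the token stream.
    static Node parse(String source) {
        return CONSTANT.from(TOKENIZER, Scanners.WHITESPACES).parse(source);
    }
}
```

With this split, ConstantParserSketch.parse("'test'") should give you a ConstantNode, and the token-level parser never tries to scan characters itself.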

This is certainly one of the less documented parts of JParsec, which is not really well documented overall either!

Licensed under: CC-BY-SA with attribution