Question

I'm trying to write a spirit grammar component that works with a lexer (when part of a larger project) or just with qi::parsers (such as int_) for testing.

Below is a sample parser (a really verbose way to parse an int). The problem is the lex_int function. I would like the second overload to be used if the token is qi::unused_type (no lexer) and the first when a lexer is provided. I figure I must use some template or MPL technique, since tok.integer_ is a compile error for qi::unused_type.

As an aside, even with USE_LEXER defined, it now dumps core. Inlining the code with preprocessor defines works fine, but that seems so last century.

#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/lex_lexertl.hpp>

namespace qi    = boost::spirit::qi;
namespace lex   = boost::spirit::lex;

#define USE_MYINT
#define USE_LEXER

// my grammar replacing int_
template<typename Iterator, typename Skipper=qi::space_type>
struct my_int : qi::grammar<Iterator, int(), Skipper>
{
    qi::rule<Iterator, int(), Skipper> start;

    template<typename TokenDef>
    my_int(TokenDef &tok): my_int::base_type(start)
    {
        start %= lex_int(tok);
        BOOST_SPIRIT_DEBUG_NODE(start);
    }

    // overload for lexer
    template<typename TokenDef>
    decltype(start) lex_int(TokenDef &tok)
    {
        return tok.integer_;
    }

    // overload for no lexer
    // template<typename TokenDef>
    decltype(start) lex_int(qi::unused_type)
    {
        return qi::int_;
    }
};

A full (compilable) example is at dual_grammar.cc. The example works with USE_MYINT and USE_LEXER both defined and undefined. The goal is automatic selection via the USE_AUTO_SELECT symbol.


Solution

Just do what works, not what is fancy. Believe me, the fancy route will hurt you more than you can anticipate (including that horrifying class of bugs that doesn't manifest until your application crashes in production).

Tip 1: With Spirit, colour within the lines

The fact of the matter is that you can't really return Proto-based expression templates by value, because they're steeped in references to temporaries. Those temporaries are not meant to live beyond the end of the full-expression that contained them. This is typical of expression templates: they're fake expressions, but they can contain literals that live as temporaries during the construction of the template expression, right up until parser::compile().

It is for this reason that any attempt to use runtime factories (like your lex_int) leads to pain.
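To make the lifetime problem concrete, here is a minimal sketch of a toy expression template. Everything in it (lit, sum_expr, eval, consumed_immediately) is hypothetical and only imitates the way Proto nodes keep their operands by reference; it is not Spirit or Proto code.

```cpp
// Hypothetical miniature expression template; none of these names come from
// Spirit or Proto. Nodes hold their operands by reference, as Proto nodes do.
struct lit { int value; };

struct sum_expr {
    const lit& lhs;  // refers to an operand, which may be a temporary
    const lit& rhs;
};

// Building the expression only records references; nothing is copied.
constexpr sum_expr operator+(const lit& a, const lit& b) { return sum_expr{a, b}; }

// "Compiling" the expression reads through the stored references.
constexpr int eval(const sum_expr& e) { return e.lhs.value + e.rhs.value; }

// Safe: the temporaries are still alive because everything happens inside
// one full-expression.
constexpr int consumed_immediately() { return eval(lit{1} + lit{2}); }

// Unsafe (left commented out on purpose): both lit temporaries die at the
// semicolon, so `stored` would hold two dangling references.
//   sum_expr stored = lit{1} + lit{2};
//   int oops = eval(stored);  // undefined behaviour
```

A factory function that returns such an expression is in the same position as the commented-out lines: by the time the caller uses the result, the operands are gone. With real Proto expressions, boost::proto::deep_copy is the usual way to obtain a copy that owns its operands, though whether that fits a given grammar is a separate question.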

Also it looks a bit yesterday :)

To alleviate the problems, you could move all decisions to compile time. I know you tried to, but you were still passing rules by value, which isn't static, because Spirit V2 wasn't written in a time when everything could be constexpr-proofed. (If you look at Proto-0x you will find that that is the future of the libraries.)

So, actually you could specialize the grammar on a trait that detects that the Iterator is a token iterator.

Note that you might want to take that opportunity to disable the Skipper too, because using the qi::space_type parser as skipper doesn't usually make sense with a lexer.
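A rough sketch of that kind of trait-based selection follows. It deliberately avoids Boost so it stands alone: token_iterator, space_skipper, no_skipper, is_token_iterator and my_int_traits are all hypothetical names. In real code the stand-ins would be the lexer's iterator type, qi::space_type and qi::unused_type, and the two branches would be two specializations of the grammar itself.

```cpp
#include <string>
#include <type_traits>

// Hypothetical stand-ins so the sketch compiles without Boost. In real code
// these would be the lexer's iterator type, qi::space_type and
// qi::unused_type respectively.
template <typename Functor> struct token_iterator {};
struct space_skipper {};
struct no_skipper {};

// Trait: false for ordinary iterators, true for the token iterator template.
template <typename Iterator>
struct is_token_iterator : std::false_type {};

template <typename Functor>
struct is_token_iterator<token_iterator<Functor>> : std::true_type {};

// Primary template: the Qi-only flavour, which skips whitespace itself.
template <typename Iterator, bool UsesLexer = is_token_iterator<Iterator>::value>
struct my_int_traits {
    typedef space_skipper skipper_type;
    static constexpr const char* integer_source() { return "qi::int_"; }
};

// Specialization for token iterators. Assumption: the lexer already deals
// with whitespace, so no skipper is selected here.
template <typename Iterator>
struct my_int_traits<Iterator, true> {
    typedef no_skipper skipper_type;
    static constexpr const char* integer_source() { return "tok.integer_"; }
};
```

The point of the design is that the choice is made entirely from the Iterator type, so no parser expression ever has to be returned from a function; each specialization would build its rule in place in its own constructor.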

Honestly, I'd just write separate parsers. Or better yet, commit to one. And my preference would strongly lie with the Qi-only parser, because it leads to much more flexibility.

Tip 2: If not for flexibility, why [do we] use Spirit?

If I really needed to cast a grammar in stone and required ultimate performance, I'd use a parser generator like ANTLR or Coco/R, or even hand-roll my parser.

Licensed under: CC-BY-SA with attribution