Parsing a tokenized free form grammar with Boost.Spirit

Question

The token attribute exposes a variant which, in addition to the base-iterator range, can assume the types declared in the token_type typedef:

    typedef lex::lexertl::token<base_iterator_type, mpl::vector<std::string, int, double>> token_type;

So: string, int and double. Note however that coercion into one of the possible types will only occur lazily, when the parser actually uses the value.

A utree is a very versatile container ^[1]. Hence, when you expose a spirit::utree attribute on a rule and the token value variant still contains an iterator_range, Qi attempts to assign that range into the utree object (this fails, because the iterators are ... 'funky').

The easiest way to get your desired behaviour is to force Qi to interpret the attribute of the tag token as a string, and have that assigned to the utree. Therefore the following line constitutes a fix that will make compilation succeed:

    unknowntagvalue = qi::as_string[tok.tag] >> restofline;

Notes

Having said all this, I would suggest the following:

- Consider using the Nabialek Trick to dispatch different lazy rules depending on the tag matched. This makes it unnecessary to deal with raw utrees later on.
- You might also have success specializing the boost::spirit::traits::assign_to_XXXXXX traits (see documentation).
- Consider using a pure Qi parser. While I can "feel" your sentiment that "it is going to be brittle" ^[2], it seems you have already demonstrated that using a lexer raises the complexity to such a degree that it might not have net merit:
  - the unexpected ways in which attributes materialize (this question)
  - the problem with line-pos iterators (this is a frequently asked question, and AFAIR it has mostly hard or inelegant solutions)
  - the inflexibility regarding e.g. ad-hoc debugging (access to source data in semantic actions), switching/disabling skippers etc.
  - my personal experience is that looking at lexer states to drive these isn't helpful, because switching lexer state only works reliably from lexer token semantic actions, whereas the disambiguation often happens in the Qi phase

but I digress :)

^[1] e.g. they have facilities for very lightweight 'referencing' of iterator ranges (e.g. for symbols, or to avoid copying characters from a source buffer into the attribute unless wanted)

^[2] In effect, only because using a sequential lexer (scanner) vastly reduces the number of backtrack opportunities, so it simplifies the mental model of the parser. However, you can use expectation points to much the same effect.