Error avalanche in Boost.Spirit.Qi usage

https://stackoverflow.com/questions/2218100

19-09-2019

Question

I can't figure out what's wrong with my code. Boost's templates are driving me crazy! I can't make heads or tails of all this, so I just had to ask.

What's wrong with this?

#include <iostream>
#include <boost/lambda/lambda.hpp>
#include <boost/spirit/include/qi.hpp>

void parsePathTest(const std::string &path)
{
    namespace lambda = boost::lambda;
    using namespace boost::spirit;

    const std::string permitted = "._\\-#@a-zA-Z0-9";
    const std::string physicalPermitted = permitted + "/\\\\";
    const std::string archivedPermitted = permitted + ":{}";

    std::string physical,archived;

    // avoids non-const reference to rvalue
    std::string::const_iterator begin = path.begin(),end = path.end();

    // splits a string like "some/nice-path/while_checking:permitted#symbols.bin"
    // as physical = "some/nice-path/while_checking"
    // and archived = "permitted#symbols.bin" (if this portion exists)
    // I could barely find out the type for this expression
    auto expr
        =   ( +char_(physicalPermitted) ) [lambda::var(physical) = lambda::_1]
            >> -(
                    ':'
                    >> (
                           +char_(archivedPermitted) [lambda::var(archived) = lambda::_1]
                       )
                )
        ;

    // the error occurs in a template instantiated from here
    qi::parse(begin,end,expr);

    std::cout << physical << '\n' << archived << '\n';
}

The number of errors is immense; I suggest that anyone who wants to help compile this on their own (trust me, pasting the errors here is impractical). I am using the latest TDM-GCC version (GCC 4.4.1) and Boost version 1.39.00.

As a bonus, I would like to ask two more things: whether C++0x's new static_assert functionality will help Boost in this respect, and whether the implementation I've quoted above is a good idea, or whether I should use Boost's String Algorithms library instead. Would the latter be likely to give much better performance?

Thanks.

-- edit

The following minimal sample fails with exactly the same error as the code above.

#include <iostream>
#include <boost/spirit/include/qi.hpp>

int main()
{
    using namespace boost::spirit;

    std::string str = "sample";
    std::string::const_iterator begin(str.begin()), end(str.end());

    auto expr
        =   ( +char_("a-zA-Z") )
        ;

    // the error occurs in a template instantiated from here
    if (qi::parse(begin,end,expr))
    {
        std::cout << "[+] Parsed!\n";
    }
    else
    {
        std::cout << "[-] Parsing failed.\n";
    }

    return 0;
}

-- edit 2

I still don't know why it didn't work in my old version of Boost (1.39), but upgrading to Boost 1.42 solved the problem. The following code compiles and runs perfectly with Boost 1.42:

#include <iostream>
#include <boost/spirit/include/qi.hpp>

int main()
{
    using namespace boost::spirit;

    std::string str = "sample";
    std::string::const_iterator begin(str.begin()), end(str.end());

    auto expr
        =   ( +qi::char_("a-zA-Z") ) // notice this line; char_ is not part of 
                                     // boost::spirit anymore (or maybe I didn't 
                                     // include the right headers, but, regardless, 
                                     // hkaiser said I should use qi::char_, so here 
                                     // it goes)
        ;

    // with Boost 1.39 the error occurred in a template instantiated from here
    if (qi::parse(begin,end,expr))
    {
        std::cout << "[+] Parsed!\n";
    }
    else
    {
        std::cout << "[-] Parsing failed.\n";
    }

    return 0;
}

Thanks for the tips, hkaiser.

Solution

Several remarks:

a) Don't use the Spirit V2 beta version distributed with Boost V1.39 and V1.40. Use at least Spirit V2.1 (as released with Boost V1.41) instead, as it contains a lot of bug fixes and performance enhancements (both compile-time and runtime performance). If you can't switch Boost versions, read here for how to proceed.

b) Try to avoid using boost::lambda or boost::bind with Spirit V2.x. Yes, I know, the docs say it works, but you have to know what you're doing. Use boost::phoenix expressions instead. Spirit 'knows' about Phoenix, which makes writing semantic actions easier. If you use Phoenix, your code will look like:

#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>

namespace qi = boost::spirit::qi;
namespace phoenix = boost::phoenix;

// physicalPermitted and archivedPermitted are the strings from the question
std::string physical, archived;
auto expr
    =   ( +qi::char_(physicalPermitted) ) [phoenix::ref(physical) = qi::_1]
    >> -(
            ':'
            >> ( +qi::char_(archivedPermitted) )[phoenix::ref(archived) = qi::_1]
        )
    ;

But your overall parser will get even simpler if you utilize Spirit's built-in attribute propagation rules:

std::string physical;
boost::optional<std::string> archived;  

qi::parse(begin, end, 
    +qi::char_(physicalPermitted) >> -(':' >> +qi::char_(archivedPermitted)),
    physical, archived);

That is, there is no need for semantic actions at all. If you need more information about attribute handling, see the article series about the Magic of Attributes on Spirit's web site.

Edit:

Regarding your static_assert question: yes, static_assert can improve error messages, as it can be used to trigger compiler errors as early as possible. In fact, Spirit already uses this technique extensively. However, it cannot protect the user from huge error messages in all cases, only for those user errors the library author anticipated. Only concepts (which unfortunately didn't make it into the new C++ Standard) could have been used to reduce the size of the error messages in general.
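To make the static_assert point concrete, here is a minimal sketch in plain C++11. It is not Spirit's actual code: the trait is_forward_iterator and the function count_letters are hypothetical names made up for illustration. The idea is that a library can check an anticipated mistake (here, passing an iterator that is weaker than a forward iterator) at the top of its entry point and emit one readable message instead of a long chain of template errors.

```cpp
#include <cstddef>
#include <iterator>
#include <string>
#include <type_traits>

// Hypothetical trait: true when Iterator is at least a forward iterator.
template <typename Iterator>
struct is_forward_iterator
    : std::is_base_of<
          std::forward_iterator_tag,
          typename std::iterator_traits<Iterator>::iterator_category> {};

// Hypothetical library entry point that counts ASCII letters in a range.
// The static_assert fires before any of the body is instantiated further,
// so a misuse produces one short, targeted diagnostic.
template <typename Iterator>
std::size_t count_letters(Iterator first, Iterator last)
{
    static_assert(is_forward_iterator<Iterator>::value,
                  "count_letters requires at least a forward iterator");

    std::size_t n = 0;
    for (; first != last; ++first) {
        const char c = *first;
        if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'))
            ++n;
    }
    return n;
}
```

Calling count_letters with std::string iterators compiles normally, while calling it with std::istream_iterator<char> (an input iterator) stops at the static_assert message. This only helps for mistakes the author thought of in advance, which is exactly the limitation described above.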

Regarding your question about Boost's String Algorithms: it's certainly possible to use this library for simple tasks such as yours. You might even be better off using Boost.Tokenizer (if all you need is to split the input string at the ':'). The performance of Spirit should be comparable to that of the string algorithms, but this certainly depends on the code you write. If you assume that the string algorithm you use requires one pass over the input string data, then Spirit won't be faster (as it does one pass as well).
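For comparison, this is roughly what "just split at the ':'" looks like as a single pass over the input. The sketch deliberately uses only the standard library as a stand-in for Boost.Tokenizer or the String Algorithms library; SplitPath and splitPath are hypothetical names, and no character checking is done here.

```cpp
#include <string>

// Hypothetical result type for the split.
struct SplitPath {
    std::string physical;  // everything before the first ':'
    std::string archived;  // everything after the first ':' (may be empty)
    bool hasArchived;      // false when the input contains no ':'
};

// Splits at the first ':' without validating any characters.
SplitPath splitPath(const std::string& path)
{
    SplitPath result;
    const std::string::size_type colon = path.find(':');

    result.hasArchived = (colon != std::string::npos);
    // substr(0, npos) yields the whole string when there is no ':'.
    result.physical = path.substr(0, colon);
    if (result.hasArchived)
        result.archived = path.substr(colon + 1);
    return result;
}
```

For the input from the question, "some/nice-path/while_checking:permitted#symbols.bin", this yields physical = "some/nice-path/while_checking" and archived = "permitted#symbols.bin".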

Neither the Boost String algorithms nor the Boost Tokenizer can give you the verification of the characters matched. Your Spirit grammar matches only the characters you specified in the character classes. So if you need this matching/verification, you should use either Spirit or Boost Regex.
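If you split the string with general-purpose string functions, the verification that Spirit's character classes give you for free has to be written by hand. A minimal sketch of such a check, using only the standard library, might look like the following. The names kCommonPermitted, isPermittedChar and isPermitted are hypothetical, and the permitted characters are taken from the question (letters, digits, and "._-#@", plus extra characters per part). Note that std::isalnum depends on the current C locale; in the default "C" locale it accepts exactly a-z, A-Z and 0-9.

```cpp
#include <cctype>
#include <string>

// Symbols the question allows in both parts, besides letters and digits.
const std::string kCommonPermitted = "._-#@";

// True when c is a letter, a digit, a common symbol, or listed in `extra`.
bool isPermittedChar(char c, const std::string& extra)
{
    return std::isalnum(static_cast<unsigned char>(c)) != 0
        || kCommonPermitted.find(c) != std::string::npos
        || extra.find(c) != std::string::npos;
}

// True when text is non-empty and contains only permitted characters.
// Mirrors the "one or more" meaning of +char_(...) in the grammar.
bool isPermitted(const std::string& text, const std::string& extra)
{
    if (text.empty())
        return false;
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        if (!isPermittedChar(text[i], extra))
            return false;
    }
    return true;
}
```

Under these assumptions, the physical part would be checked with isPermitted(physical, "/\\") and the archived part with isPermitted(archived, ":{}").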

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow