I'm sorry to break it to you, but your grammar is far more broken than you imagined.
conds = *(char_) // ...
Right here, you're basically just parsing all the input into a single string, with whitespace skipped. In fact, adding
for (auto& el : data)
    std::cout << "'" << el << "'\n";
after parsing prints:
Pair Test
parse success
No Of Key-Value Pairs= 2
'book.author_id='1234'andbook.isbn='xy99'andbook.type='abc'andbook.lang='Eng''
''
As you can see, the first element is the string that *char_ parsed, and you get an empty element for free because both conds and cond match on empty input.
I strongly suggest you start simple. And I mean much simpler.
Slowly build your grammar up from the ground. Spirit lends itself very well to test-driven development (except for the compile times, but hey, you get more time to think!).
Here's something that I just made up, starting from the very first building block, the identifier, and working my way up to the higher-level elements:
// lexemes (no skipper)
ident = +char_("a-zA-Z.");
op = no_case [ lit("=") | "<>" | "LIKE" | "IS" ];
nulllit = no_case [ "NULL" ];
and_ = no_case [ "AND" ];
stringlit = "'" >> *~char_("'") >> "'";
// other productions
field = ident;
value = stringlit | nulllit;
condition = field >> op >> value;
conjunction = condition % and_;
start = conjunction;
This is close to the simplest thing that I suppose could parse your grammar (with a few creative liberties here and there, where they don't seem too intrusive).
UPDATE: So this is where I got in 20 minutes:
I always start out with mapping the types that I want the rules to expose:
namespace ast
{
    enum op { op_equal, op_inequal, op_like, op_is };
    struct null { };
    typedef boost::variant<null, std::string> value;
    struct condition
    {
        std::string _field;
        op          _op;
        value       _value;
    };
    typedef std::vector<condition> conditions;
}
Only condition cannot be "naturally" used in a Spirit grammar without adaptation:
BOOST_FUSION_ADAPT_STRUCT(ast::condition, (std::string,_field)(ast::op,_op)(ast::value,_value))
Now comes the grammar itself:
// lexemes (no skipper)
ident = +char_("a-zA-Z._");
op_token.add
    ("=",    ast::op_equal)
    ("<>",   ast::op_inequal)
    ("like", ast::op_like)
    ("is",   ast::op_is);
op = no_case [ op_token ];
nulllit = no_case [ "NULL" >> attr(ast::null()) ];
and_ = no_case [ "AND" ];
stringlit = "'" >> *~char_("'") >> "'";
// other productions
field = ident;
value = stringlit | nulllit;
condition = field >> op >> value;
whereclause = condition % and_;
start = whereclause;
You can see minor deviations from my original sketch, which is interesting:
- added _ to the identifier chars
- moved op_token into a symbol matcher (because that makes it easier to map the enum values)
See it all Live And Working On Coliru, output:
Pair Test
parse success
No Of Key-Value Pairs= 4
( [book.author_id] = 1234 )
( [book.isbn] LIKE xy99 )
( [book.type] = abc )
( [book.lang] IS NULL )
book.author_id = '1234' and book.isbn liKE 'xy99' and book.type = 'abc' and book.lang IS null