How do I parse this correctly with Spirit?
19-08-2019
Question
My situation: I'm new to Spirit, I have to use VC6 and am thus using Spirit 1.6.4.
I have a line that looks like this:
//The Description;DESCRIPTION;;
If the line starts with //The Description; I want to put the text DESCRIPTION into a string.
I have something that works, but it doesn't look very elegant to me:
    vector<char> vDescription; // std::string doesn't work due to missing ::clear() in VC6's STL implementation
    if(parse(chars,
        // Begin grammar
        (
            as_lower_d["//the description;"]
            >> (+~ch_p(';'))[assign(vDescription)]
        ),
        // End grammar
        space_p).hit)
    {
        const string desc(vDescription.begin(), vDescription.end());
    }
I would much rather assign all printable characters up to the next ';', but the following doesn't work because parse(...).hit == false:
    parse(chars,
        // Begin grammar
        (
            as_lower_d["//the description;"]
            >> (+print_p)[assign(vDescription)]
            >> ';'
        ),
        // End grammar
        space_p).hit
How do I make it hit?
Solution
You might try using confix_p:
    confix_p(as_lower_d["//the description;"],
        (+print_p)[assign(vDescription)],
        ch_p(';')
    )
It should be equivalent to Fred's response.
Your code fails because print_p is greedy. The +print_p parser consumes characters until it reaches the end of the input or a non-printable character. A semicolon is printable, so print_p claims it. Your input gets exhausted, the variable is assigned, and then the match fails, because there's nothing left for the final ';' of your parser to match.
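The failure mode described above can be reproduced without Spirit. The following is a minimal hand-written sketch, not Spirit code: match_greedy_printable, match_printable_except and parse_field are hypothetical names made up for this illustration, standing in for +print_p, +(print_p - ';') and "body >> ';'" respectively.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Stand-in for a greedy "+print_p": consumes printable characters
// starting at pos and never gives any of them back.
std::size_t match_greedy_printable(const std::string& in, std::size_t pos) {
    while (pos < in.size() &&
           std::isprint(static_cast<unsigned char>(in[pos])))
        ++pos;
    return pos;
}

// Stand-in for "+(print_p - ';')": the same loop, but it also stops
// in front of the stop character.
std::size_t match_printable_except(const std::string& in, std::size_t pos,
                                   char stop) {
    while (pos < in.size() && in[pos] != stop &&
           std::isprint(static_cast<unsigned char>(in[pos])))
        ++pos;
    return pos;
}

// Models "body[assign(out)] >> ';'" with no backtracking into body.
// As in the original grammar, out is assigned as soon as the body
// matches, even if the closing ';' is then missing.
bool parse_field(const std::string& in, bool exclude_semicolon,
                 std::string& out) {
    const std::size_t end = exclude_semicolon
                                ? match_printable_except(in, 0, ';')
                                : match_greedy_printable(in, 0);
    if (end == 0)
        return false;            // '+' needs at least one character
    out.assign(in, 0, end);      // the semantic action fires here
    return end < in.size() && in[end] == ';';
}
```

For the input "DESCRIPTION;;", the greedy variant (exclude_semicolon == false) stores "DESCRIPTION;;" and then reports failure, while the excluding variant stores "DESCRIPTION" and reports success.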
Fred's answer constructs a new parser, (print_p - ';'), which matches everything print_p does, except for semicolons. "Match everything except X, and then match X" is a common pattern, so confix_p is provided as a shortcut for constructing that kind of parser. The documentation suggests using it for parsing C- or Pascal-style comments, but that's not required.
For your code to work, Spirit would need to recognize that the greedy print_p matched too much and then backtrack to allow matching less. But although Spirit will backtrack, it won't backtrack to the "middle" of what a sub-parser would otherwise greedily match. It will backtrack to the next "choice point," but your grammar doesn't have any. See Exhaustive backtracking and greedy RD in the Spirit documentation.
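For contrast, here is what a matcher that does backtrack into a greedy repetition would do with the same line. This sketch uses std::regex from a modern standard library (so it is not an option on VC6) purely as an analogy. regex_field is a hypothetical helper name, and the character classes are assumed ASCII approximations of print_p and (print_p - ';').

```cpp
#include <regex>
#include <string>

// Searches line for "//the description;" followed by a captured body and
// a ';'. body is a regex fragment supplied by the caller. On success the
// captured text is stored in out.
bool regex_field(const std::string& line, const std::string& body,
                 std::string& out) {
    const std::regex re("^//the description;(" + body + ");",
                        std::regex::icase);
    std::smatch m;
    if (!std::regex_search(line, m, re))
        return false;
    out = m[1].str();
    return true;
}

// "[ -~]+"    : any printable ASCII character, greedy (like +print_p)
// "[ -:<-~]+" : printable ASCII except ';'     (like +(print_p - ';'))
```

With the line //The Description;DESCRIPTION;; the body "[ -~]+" does produce a match, but only by giving back one character, so the capture is "DESCRIPTION;" with a trailing semicolon. The body "[ -:<-~]+" captures "DESCRIPTION". Even with full backtracking, then, excluding the terminator from the body is what yields the intended text.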
OTHER TIPS
You're not getting a hit because ';' is matched by print_p. Try this:
    parse(chars,
        // Begin grammar
        (
            as_lower_d["//the description;"]
            >> (+(print_p - ';'))[assign(vDescription)]
            >> ';'
        ),
        // End grammar
        space_p).hit