Replacing elements in a QString based on a predicate
Question
I need to replace characters in a QString based on their QChar::category. In stdlib terms I want to do this:
    string.erase(std::remove_if(begin(string), end(string),
                                [](QChar c) {
                                    QChar::Category cat = c.category();
                                    return cat == .... || cat == ...;
                                }),
                 string.end());
Alternatively, I'm happy with a regexp that works on Unicode character categories that I can use for QString::replace.
Is that possible with QString, or do I really need to turn the string into a std::vector<QChar> and back?
Edit: The categories I want to keep:
- for the first character: $, _, or any character in the Unicode categories “Uppercase letter (Lu)”, “Lowercase letter (Ll)”, “Titlecase letter (Lt)”, “Modifier letter (Lm)”, “Other letter (Lo)”, or “Letter number (Nl)”
- for the rest: the first bullet plus any U+200C zero width non-joiner characters, U+200D zero width joiner characters, and characters in the Unicode categories “Non-spacing mark (Mn)”, “Spacing combining mark (Mc)”, “Decimal digit number (Nd)”, or “Connector punctuation (Pc)”.
I can do first/rest in multiple passes.
Solution
Qt provides its own way to do such things. Whether that way is any better is debatable, but the idiomatic Qt approach would be:
    QString result;
    result.reserve(string.size());
    foreach (const QChar& c, string) {
        if (is_good(c)) {
            result += c;
        }
    }
Of course, you can also do it with a lambda and std::for_each:
    std::for_each(string.begin(), string.end(),
                  [&result](QChar c) {
                      if (is_good(c)) { result += c; }
                  });
but that is not idiomatic Qt.
Note that removing characters from a string in place is slower than appending them to a new string whose capacity was reserved up front, which is why the first code sample is fast.