Question

I need to replace characters in a QString based on their QChar::category. In stdlib terms I want to

string.erase(std::remove_if(begin(string), end(string),
                            [](QChar c) {
                                QChar::Category cat = c.category();
                                return cat == .... || cat == ...;
                            }),
             string.end());

Alternatively, I'm happy with a regexp that works on unicode character categories that I can use for QString::replace.

Is that possible with QString, or do I really need to turn the string into a std::vector<QChar> and back?

Edit: The categories I want to keep:

  • for the first character: $, _, or any character in the Unicode categories “Uppercase letter (Lu)”, “Lowercase letter (Ll)”, “Titlecase letter (Lt)”, “Modifier letter (Lm)”, “Other letter (Lo)”, or “Letter number (Nl)”
  • for the rest: the first bullet plus any U+200C zero width non-joiner characters, U+200D zero width joiner characters, and characters in the Unicode categories “Non-spacing mark (Mn)”, “Spacing combining mark (Mc)”, “Decimal digit number (Nd)”, or “Connector punctuation (Pc)”.

I can do first/rest in multiple passes.

Solution

Qt provides its own ways to do such things. Whether they are good is debatable, but the idiomatic Qt approach would be:

QString result;
result.reserve(string.size());
foreach (const QChar& c, string) {
    if (is_good(c)) {
        result += c;
    }
}

Of course, you can also do it with a lambda and std::for_each:

std::for_each(string.begin(), string.end(),
              [&result](QChar c) {
                  if (is_good(c)) {
                      result += c;
                  }
              });

but that is not idiomatic Qt.

Note that removing characters from a string is slower than appending them to a new string whose space was reserved in advance, which is why the first code sample is fast.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow