QString::replace(const QRegExp &, const QString &) and QString::replace(const QRegularExpression &, const QString &) work differently

StackOverflow https://stackoverflow.com/questions/17280005

  •  01-06-2022

Question

Why, in Qt 5.1.0 Release Candidate, does QString::replace(const QRegExp & rx, const QString & after) treat \v one way while QString::replace(const QRegularExpression & re, const QString & after) treats it another way? This is the piece of code I used:

QString ss("a\t\v  bc \t cdef\vg\r\r\t hi");
QString ss1(ss);
ss1.replace(QRegExp("\\s{2,}"), " ");
QString ss2(ss);
ss2.replace(QRegularExpression("\\s{2,}"), " ");

The values shown in the debugger are:

ss  "a\t\013  bc \t cdef\013g\r\r\t hi"
ss1 "a bc cdef\013g hi"
ss2 "a\t\013 bc cdef\013g hi"

Thank you


Solution

QRegExp's \s matches any character for which QChar::isSpace() returns true: the Unicode "separator" categories plus a few control characters. This includes \v.

QRegularExpression is a wrapper around PCRE, where the documentation states (http://pcre.org/pcre.txt):

For compatibility with Perl, \s does not match the VT character (code 11). This makes it different from the POSIX "space" class. The \s characters are HT (9), LF (10), FF (12), CR (13), and space (32). If "use locale;" is included in a Perl script, \s may match the VT character. In PCRE, it never does.
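To see how that one character accounts for the two results in the question, here is a minimal Qt-free sketch that emulates replacing \s{2,} with a single space under each definition of \s. It assumes ASCII input only, and the names isPcreSpace, isSpaceWithVt and collapseRuns are hypothetical helpers made up for this illustration; they are not part of Qt or PCRE.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// The \s set from the PCRE documentation quoted above:
// HT (9), LF (10), FF (12), CR (13) and space (32). VT (11) is excluded.
bool isPcreSpace(char c) {
    return c == '\t' || c == '\n' || c == '\f' || c == '\r' || c == ' ';
}

// The same set plus VT (11). This only approximates what QRegExp treats
// as whitespace, restricted to ASCII (assumption for this sketch).
bool isSpaceWithVt(char c) {
    return isPcreSpace(c) || c == '\v';
}

// Emulates replace(\s{2,}, " "): every run of two or more characters
// accepted by isSpace becomes one space; everything else is copied as-is.
std::string collapseRuns(const std::string &in,
                         const std::function<bool(char)> &isSpace) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        // Find the end of the whitespace run starting at i (may be empty).
        std::size_t j = i;
        while (j < in.size() && isSpace(in[j])) {
            ++j;
        }
        if (j - i >= 2) {
            out += ' ';   // run of 2+ whitespace characters: collapse it
            i = j;
        } else {
            out += in[i]; // single whitespace or ordinary character: keep it
            ++i;
        }
    }
    return out;
}
```

With the string from the question, collapseRuns(ss, isSpaceWithVt) gives the ss1 value and collapseRuns(ss, isPcreSpace) gives the ss2 value: in the second case the \v splits "\t\v  " into a lone \t, an unmatched \v, and a two-space run.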

Although the documentation says it never matches \v, you could try passing QRegularExpression::UseUnicodePropertiesOption to the QRegularExpression constructor. That option switches the character classes over to Unicode properties, so in theory, unless a specific exception is built into PCRE, \s should then match \v.

Failing that, you can use (\h|\v) (in C++ string form that's "(\\h|\\v)"), using PCRE's special "horizontal space" and "vertical space" classes. Applied to the pattern in the question, that would be "(\\h|\\v){2,}".

Licensed under: CC-BY-SA with attribution