Question

I'm converting an application from the .NET Framework to Qt using C++. The application makes extensive use of regular expression Unicode properties such as \p{L} and \p{M}. I've just discovered that the QRegExp class lacks support for these, among other things (lookbehinds, etc.).

Can anyone recommend a C++ regular expression library that:

  • Supports Unicode properties
  • Is Unicode-aware in other respects (e.g. \w matches more than ASCII word characters)
  • As a bonus, supports lookbehinds.

Please don't point me to the Wikipedia article; I don't trust it. That article says that QRegExp supports Unicode properties. Unless I'm really doing something wrong, it doesn't. I'm looking for someone who is actually using Unicode properties with a regex library in a project.

Solution

http://site.icu-project.org/

ICU is a mature, widely used set of C/C++ and Java libraries providing Unicode and Globalization support for software applications.

  • released under a nonrestrictive open source license
  • ...
  • Regular Expression: ICU's regular expressions fully support Unicode while providing very competitive performance.

It's also compatible with Boost (Boost.Regex can be built with ICU support); see their statement in this regard.

OTHER TIPS

Nothing should stop you from using PCRE (http://www.pcre.org/), though converting back and forth between QStrings and const char * strings could be painful and cost some performance.

Licensed under: CC-BY-SA with attribution