Question

RE2 is a modern regular expression engine from Google. I want to use RE2 in a program that currently uses gnuregex. My problem is finding out where a match occurred: RE2 returns the string that matched, but I need the offset of the match within the input. My current plan is to take what RE2 returns and then call find on the C++ string, but that seems wasteful. I've gone through the RE2 manual and can't figure out how to do it. Any ideas?

Solution

Store the result in a re2::StringPiece instead of a std::string. A StringPiece does not copy the match; its .data() points into the original string, so subtracting the start of the input from it gives the offset.

Consider this program. In each of the tests, result.data() is a pointer into the original const char* or std::string.

#include <re2/re2.h>
#include <iostream>
#include <string>

int main(void) {

  { // Try it once with character pointers
    const char *text[] = { "Once", "in", "Persia", "reigned", "a", "king" };

    for(int i = 0; i < 6; i++) {
      re2::StringPiece result;
      if(RE2::PartialMatch(text[i], "([aeiou])", &result))
        std::cout << "First lower-case vowel at " << result.data() - text[i] << "\n";
      else
        std::cout << "No lower-case vowel\n";
    }
  }

  { // Try it once with std::string
    std::string text[] = { "While", "I", "pondered,", "weak", "and", "weary" };

    for(int i = 0; i < 6; i++) {
      re2::StringPiece result;
      if(RE2::PartialMatch(text[i], "([aeiou])", &result))
        std::cout << "First lower-case vowel at " << result.data() - text[i].data() << "\n";
      else
        std::cout << "No lower-case vowel\n";
    }
  }
}
Licensed under: CC-BY-SA with attribution