Question

According to the regex documentation, \b matches a word boundary. I prepared a string

"db bd how to"

and regex

\b(a|b)(c|d)\b

I expected the regex to match "bd" in the string, but it doesn't.

But if the regex is

\\b(a|b)(c|d)\\b

it matches.

Can you explain the difference?

Solution

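There are several layers of escaping at work here. The regex engine has to receive the two characters \b, but inside a string literal the backslash is typically an escape character itself, so it must be doubled to survive into the string: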

meaning                  | regex literal    | string literal
-------------------------+------------------+---------------
word boundary            | \b               | \\b
alternation ("a" or "b") | (a|b)            | (a|b)
alternation ("c" or "d") | (c|d)            | (c|d)
word boundary            | \b               | \\b
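
The mapping from the middle column to the right-hand column can be sketched in C. This is only an illustration under stated assumptions: to_string_literal_form and check_table_rows are hypothetical names, and the helper doubles backslashes only (it does not handle quotes or other characters that a string literal would also need escaped).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies `regex` into `out`, doubling every backslash,
   which is the form the regex needs inside a double-quoted string literal.
   Returns 0 on success, -1 if `out` (capacity `cap`) is too small. */
int to_string_literal_form(const char *regex, char *out, size_t cap)
{
    size_t n = 0;

    if (cap == 0)
        return -1;

    for (const char *p = regex; *p != '\0'; ++p) {
        size_t need = (*p == '\\') ? 2 : 1;
        if (n + need + 1 > cap)     /* +1 for the terminating NUL */
            return -1;
        if (*p == '\\')
            out[n++] = '\\';        /* extra backslash for the literal */
        out[n++] = *p;
    }
    out[n] = '\0';
    return 0;
}

/* Checks two rows of the table. Note that this file is C source too,
   so the regex text \b is itself written "\\b" below. */
void check_table_rows(void)
{
    char out[32];

    /* word boundary: \b becomes \\b */
    assert(to_string_literal_form("\\b", out, sizeof out) == 0);
    assert(strcmp(out, "\\\\b") == 0);

    /* alternation: no backslash, so nothing changes */
    assert(to_string_literal_form("(a|b)", out, sizeof out) == 0);
    assert(strcmp(out, "(a|b)") == 0);
}
```

Calling check_table_rows() from a main function should pass silently if the table's rows hold.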

OTHER TIPS

If this is in a C or C++ program, the compiler itself interprets the '\b' inside a string literal and replaces it with a backspace character (0x08) during compilation, so you need to escape the backslash from the compiler first.

So if you have

const char *regex = "\\bword";

in your source file and pass it to a regex function, the function will receive '\bword' (a backslash, then b, then word).
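
To see the difference at the byte level, here is a minimal C sketch. The names one_backslash, two_backslashes, starts_with_regex_word_boundary, and check_escaping are made up for illustration, and no regex library is involved; the code only inspects what the compiler stored in each string.

```c
#include <assert.h>
#include <string.h>

/* "\b" in a C string literal is a single backspace character (0x08). */
static const char one_backslash[] = "\bword";

/* "\\b" is an escaped backslash followed by the letter b. */
static const char two_backslashes[] = "\\bword";

/* Hypothetical helper: returns 1 if the text begins with a real backslash
   followed by 'b', which is what a regex engine needs to see for \b. */
int starts_with_regex_word_boundary(const char *pattern)
{
    return pattern[0] == '\\' && pattern[1] == 'b';
}

/* Asserts the byte-level difference between the two literals. */
void check_escaping(void)
{
    assert(strlen(one_backslash) == 5);     /* backspace + "word" */
    assert(one_backslash[0] == 0x08);
    assert(!starts_with_regex_word_boundary(one_backslash));

    assert(strlen(two_backslashes) == 6);   /* '\', 'b', then "word" */
    assert(starts_with_regex_word_boundary(two_backslashes));
}
```

With a single backslash, the regex function would be handed a backspace character and would never see a word-boundary escape at all.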


Edit

I don't know whether this holds for Objective-C; I have no experience with it.

Licensed under: CC-BY-SA with attribution