Question

I was using the standard \b word boundary. However, it doesn't quite deal with the dot (.) character the way I want it to.

So the following regex:

\b(\w+)\b

will match cats and dogs in cats.dogs if I have a string that says cats and dogs don't make cats.dogs.
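To reproduce the behaviour described above, here is a minimal sketch using the sample sentence from the question. The variable names are only illustrative.

```php
<?php
// Sample sentence from the question.
$str = "cats and dogs don't make cats.dogs";

// \b sits between a word character and a non-word character,
// so the dot (and the apostrophe) both count as boundaries.
preg_match_all('/\b(\w+)\b/', $str, $matches);

$words = $matches[1];
print_r($words);
```

Both halves of cats.dogs are captured as separate words, and don't is split into don and t for the same reason.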

I need a word boundary alternative that will match a whole word only if:

  1. it does not contain the dot (.) character
  2. it is surrounded by at least one space ( ) character on each side

Any ideas?!

P.S. I need this for PHP

Solution

You could try using (?<=\s) before and (?=\s) after the word in place of the \b, to ensure that there is a whitespace character on each side. However, you might also want to allow for the word being at the start or end of the string, with (?<=\s|^) and (?=\s|$).

This will automatically exclude "words" with a . in them, but it will also exclude a word at the end of a sentence, since there is no space between it and the full stop.
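As a rough sketch of how those lookarounds could be used in PHP, assuming the sample sentence from the question and keeping \w+ as the definition of a word:

```php
<?php
// Sample sentence from the question.
$str = "cats and dogs don't make cats.dogs";

// (?<=\s|^) : preceded by whitespace or the start of the string
// (?=\s|$)  : followed by whitespace or the end of the string
$pattern = '/(?<=\s|^)\w+(?=\s|$)/';

preg_match_all($pattern, $str, $matches);

$words = $matches[0];
print_r($words);
```

With this input the matches are cats, and, dogs and make. Note that don't is skipped as well, because \w does not match the apostrophe. If words like that should count, swapping \w+ for something like [^\s.]+ might be closer to what you want, but that is an assumption about the requirements.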

OTHER TIPS

What you are trying to match can be done easily with array and string functions.

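// Split on single spaces; consecutive spaces produce empty strings.
$parts = explode(' ', $str);
// Keep only the non-empty pieces that contain no dot.
$res = array_filter($parts, function ($e) {
    return $e !== '' && strpos($e, '.') === false;
});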

I recommend this method because it saves time. Spending a few hours hunting for a good regex solution is quite unproductive.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow