Question

I just need to see whether a paragraph contains a "stop word"; the stop words are in the array below.

I had the formula as:

$pattern_array = array("preheat", "minutes", "stir", "heat", "put", "beat", "bowl", "pan");

    foreach ($pattern_array as $pattern) {
      if (preg_match('/'.$pattern.')/i', $paragraph)) {
        $stopwords = 1;
      }
    }

This works well enough, but with a short stop word like 'pan', a word like 'panko' is also identified as a stop word.

So the regex would need to require that the word is preceded by a space or the start of a line, and followed by a full stop, space, comma, or other non-word character.

Also, how can I tell PHP to exit the loop as soon as a stop word is identified?

Thanks guys, slowly learning regex as I go!

Solution

Use \b(preheat|minutes|stir|heat|put|beat|bowl|pan)\b as your regex. That way you need only one regex (no loop necessary), and the \b word-boundary assertions ensure that only whole words match.
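
A minimal sketch of that single-regex approach, assuming the stop words still live in the question's $pattern_array. The function name containsStopWord is hypothetical, and preg_quote() is added only as a precaution in case a stop word ever contains regex metacharacters.

```php
<?php
// Hypothetical helper: true if $paragraph contains any stop word as a whole word.
function containsStopWord(string $paragraph, array $stopWords): bool
{
    if (count($stopWords) === 0) {
        return false; // an empty alternation would match at every word boundary
    }
    // Escape each word so regex metacharacters (and the "/" delimiter) stay literal.
    $escaped = array_map(function ($word) {
        return preg_quote($word, '/');
    }, $stopWords);
    // Builds /\b(?:preheat|minutes|...|pan)\b/i from the array.
    $regex = '/\b(?:' . implode('|', $escaped) . ')\b/i';
    return preg_match($regex, $paragraph) === 1;
}

$pattern_array = array("preheat", "minutes", "stir", "heat", "put", "beat", "bowl", "pan");

var_dump(containsStopWord("Coat the chicken in panko crumbs.", $pattern_array)); // bool(false)
var_dump(containsStopWord("Place it in a hot pan.", $pattern_array));            // bool(true)
```

Because preg_match() stops at the first match, there is no loop to exit early.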

OTHER TIPS

Haven't tried this, but \b should be the assertion you're looking for. From the PHP manual:

 \b   word boundary

Your code would then look something like this:

$pattern_array = array("preheat", "minutes", "stir", "heat", "put", "beat", "bowl", "pan");

foreach ($pattern_array as $pattern) {
  if (preg_match('/\b'.$pattern.'\b/i', $paragraph)) { // also removed the ')'
    $stopwords = 1;
    break; // to exit the loop
  }
}

Edit: seems people are better off using \b, so changed this accordingly
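
If the loop is kept, one variant is to wrap it in a function so that a return takes the place of break. This is only a sketch: the name hasStopWord is hypothetical, and preg_quote() is an added precaution that the original loop does not use.

```php
<?php
// Hypothetical helper: loops over the stop words and returns on the first hit.
function hasStopWord(string $paragraph, array $stopWords): bool
{
    foreach ($stopWords as $word) {
        // preg_quote() keeps any regex metacharacters in $word literal.
        if (preg_match('/\b' . preg_quote($word, '/') . '\b/i', $paragraph) === 1) {
            return true; // leaves the loop and the function immediately
        }
    }
    return false; // no stop word matched
}

$pattern_array = array("preheat", "minutes", "stir", "heat", "put", "beat", "bowl", "pan");

// "Stir" matches as a whole word (case-insensitively), so $stopwords becomes 1.
$stopwords = hasStopWord("Stir the sauce, then add panko.", $pattern_array) ? 1 : 0;
var_dump($stopwords);
```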

You need to add \b (which stands for word boundary) to your regex, like this:

'/\b'.$pattern.'\b/i'

You also seem to have a typo in your code: the ')' in the pattern is either meant as a literal closing bracket (in which case the bare words would not match) or it is an unmatched closing bracket.

1. You can use "\b" to check for word boundaries. A word boundary is defined as the boundary between a word character and a non-word character. Word characters are letters, digits, and the underscore.

2. You can do it all in one go by using "|":

$stopwords = preg_match('/\\b(preheat|minutes|stir|heat|..other words..|pan)\\b/i', $paragraph);
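
To make the word-boundary definition in point 1 concrete, here is a small sketch that tests a few made-up sample strings against /\bpan\b/i. The sample texts and array keys are illustrative assumptions, not part of the original question.

```php
<?php
// Each sample is tested against /\bpan\b/i to show where a word boundary exists.
$samples = array(
    "pan"        => "a hot pan.",       // "." is a non-word character
    "panko"      => "panko crumbs",     // "k" is a word character: no boundary after "pan"
    "underscore" => "pan_fry the fish", // "_" counts as a word character
    "hyphen"     => "pan-fry the fish", // "-" is a non-word character
    "uppercase"  => "Pan, then bowl",   // the i flag ignores case
);

$results = array();
foreach ($samples as $label => $text) {
    $results[$label] = preg_match('/\bpan\b/i', $text) === 1;
}
var_dump($results);
```

Only the "panko" and "underscore" samples fail to match, because in both the character right after "pan" is a word character.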
Licensed under: CC-BY-SA with attribution