Question

Here is my test code:

$test = '@12345 abc @12 @abd engng@geneng';
preg_match_all('/(^|\s)@([^@ ]+)/', $test, $matches);
print_r($matches);

And the output $matches:

Array ( [0] => Array ( [0] => @12345 [1] => @12 [2] => @abd ) [1] => Array ( [0] => [1] => [2] => ) [2] => Array ( [0] => 12345 [1] => 12 [2] => abd ) )

My question is why does it have an empty row?

[1] => Array ( [0] => [1] => [2] => )

If I get rid of (^|\s) in the regex, the second row disappears. However, I would then not be able to prevent matching @geneng.

Any answer will be appreciated.


Solution

The first group in your pattern, (^|\s), is a capturing group. Whatever it matches (the empty string at the start of the input, or the whitespace character before @) is stored in $matches[1]. print_r shows those entries as blank, which is why the row looks empty. You can avoid this with a lookaround. In this case, a positive lookbehind does the job:

preg_match_all('/(?<=^|\s)@([^@ ]+)/', $test, $matches);

This matches @ and the text after it only if the @ is at the start of the string or is preceded by whitespace. Note that lookarounds do not consume characters: they only assert that something precedes or follows the current position, so the whitespace is neither captured nor included in the match.
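As a minimal sketch, assuming the $test string from the question, the lookbehind version can be checked like this:

```php
<?php
$test = '@12345 abc @12 @abd engng@geneng';

// (?<=^|\s) asserts that @ is at the start of the string or follows
// whitespace, without capturing or consuming anything.
preg_match_all('/(?<=^|\s)@([^@ ]+)/', $test, $matches);

print_r($matches[0]); // whole matches, with no leading whitespace
print_r($matches[1]); // the text after each @
```

With this pattern there is only one capture group, so $matches has two rows, and @geneng is still skipped because it is preceded by a letter.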


Other tips

It's because (^|\s) is a capturing group:

preg_match_all('/(^|\s)@([^@ ]+)/', $test, $matches);
                 ^^^^^^

Its contents are stored as capture group 1. To avoid that, simply use a non-capturing group:

preg_match_all('/(?:^|\s)@([^@ ]+)/', $test, $matches);
                  ^^
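A short comparison of the two patterns, again assuming the $test string from the question (the variable names are only illustrative):

```php
<?php
$test = '@12345 abc @12 @abd engng@geneng';

// Capturing group: index 1 holds whatever (^|\s) matched.
preg_match_all('/(^|\s)@([^@ ]+)/', $test, $capturing);
var_export($capturing[1]); // an empty string, then two single spaces

// Non-capturing group: the text after @ moves to index 1.
preg_match_all('/(?:^|\s)@([^@ ]+)/', $test, $nonCapturing);
var_export($nonCapturing[1]);

// The whole matches still include the consumed whitespace; trim if needed.
$whole = array_map('trim', $nonCapturing[0]);
var_export($whole);
```

Unlike a lookbehind, the non-capturing group still consumes the whitespace, so it remains part of $matches[0] unless you trim it.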

By default, preg_match_all uses the PREG_PATTERN_ORDER flag. This means that you will obtain:

$matches[0] -> all substrings that match the whole pattern
$matches[1] -> the contents of capture group 1 for every match
$matches[2] -> the contents of capture group 2 for every match
etc.

You can change this behavior using the PREG_SET_ORDER flag:

$matches[0] -> array with the whole pattern and the capture groups for the first result
$matches[1] -> same for the second result
$matches[2] -> etc.
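To illustrate, here is a small sketch that runs the pattern from the question with PREG_SET_ORDER (the variable name $sets is only illustrative):

```php
<?php
$test = '@12345 abc @12 @abd engng@geneng';

// With PREG_SET_ORDER, each element of $sets describes one match:
// index 0 is the whole match, index 1 and 2 are the capture groups.
preg_match_all('/(^|\s)@([^@ ]+)/', $test, $sets, PREG_SET_ORDER);

foreach ($sets as $set) {
    echo trim($set[0]), ' => ', $set[2], PHP_EOL;
}
```

Each $set still contains the (^|\s) capture at index 1; the flag only changes how the results are grouped, not what is captured.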

In your code (PREG_PATTERN_ORDER by default), $matches[1] contains only empty or whitespace items because it holds the contents of capture group 1, (^|\s).

There are two sets of capturing parentheses, which is why you get the extra row: PHP returns one array per capture group, in addition to the array of whole matches. Removing one set of parentheses removes the corresponding array.

FYI: in this case you cannot use [^|\s] instead of (^|\s), because inside square brackets a leading ^ negates the character class: [^|\s] matches any single character that is neither | nor whitespace.
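A quick check, assuming the $test string from the question, shows that the character-class version does the opposite of what is wanted:

```php
<?php
$test = '@12345 abc @12 @abd engng@geneng';

// [^|\s] requires one real character before @ that is neither | nor whitespace,
// so it only matches an @ that sits inside a word.
preg_match_all('/[^|\s]@([^@ ]+)/', $test, $matches);

print_r($matches[0]);
print_r($matches[1]);
```

Only the @geneng inside engng@geneng is found, while the three intended mentions are skipped.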

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow