Question

Both expressions work for me:

E1=> work(?:\s+)?group 

E2=> work(\s+)?group

I need to match "workgroup" or "work group", where the space could also be a line break, hence the (\s+)?

However, the first expression uses a non-capturing group (?:...) and I am wondering whether that makes the regex faster or slower. In other words, which expression performs better?


Solution

The answer depends on the internals of the regex engine you are using.

In JavaScript, I don't know which is faster.

In PHP, a capture group can be a bit faster, although the difference is small. Here is a simple test with a simplified version of your regex.

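<?php
// One long subject; both patterns are satisfied right at the start, on "WORD1".
$string = "WORD1".str_repeat(" someword",100000);
// Single-quoted so the backslashes reach the regex engine unchanged.
$regex1='~WORD1(?:\s+\w+){0,2}~'; // non-capture group
$regex2='~WORD1(\s+\w+){0,2}~';   // capture group

// Run the same number of matches for each pattern and record the wall-clock time.
$start=microtime(TRUE);
for ($i=1;$i<1000000;$i++) preg_match($regex1,$string);
$noncapend=microtime(TRUE);
for ($i=1;$i<1000000;$i++) preg_match($regex2,$string);
$withcapend=microtime(TRUE);
$noncap = $noncapend-$start;
$withcap = $withcapend-$noncapend;
// Relative difference in percent; a negative value means the capture group was faster.
$diff = 100*($withcap-$noncap)/$noncap;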
echo "Non-Capture Group: ".$noncap."<br />";
echo "Capture Group: ".$withcap."<br />";
echo "difference: ".$diff." percent longer<br />";

?>

The output:

Note that you will get different results every time, so a difference of a few percent in either direction should not be read as significant.

Non-Capture Group: 1.092001914978
Capture Group: 1.0608019828796
difference: -2.857131628658 percent longer
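
To try the same idea on the two expressions from the question, here is a minimal sketch. The helper name timeRegex, the test subjects, and the third pattern (work\s*group, which accepts the same inputs without any group) are my own assumptions and not part of the original test. It first checks that all patterns accept the same inputs, then shows the one visible difference (the capture group adds an entry to the matches array), and finally times each pattern.

```php
<?php
// Hypothetical helper (not from the answer above): seconds spent on
// $iterations calls to preg_match. hrtime() requires PHP 7.3 or later.
function timeRegex(string $regex, string $subject, int $iterations): float
{
    $start = hrtime(true); // monotonic clock, in nanoseconds
    for ($i = 0; $i < $iterations; $i++) {
        preg_match($regex, $subject);
    }
    return (hrtime(true) - $start) / 1e9;
}

$patterns = [
    'non-capture' => '~work(?:\s+)?group~', // E1 from the question
    'capture'     => '~work(\s+)?group~',   // E2 from the question
    'no group'    => '~work\s*group~',      // assumption: my own rewrite
];

// All three patterns should accept the same inputs.
foreach ($patterns as $label => $regex) {
    foreach (["workgroup", "work group", "work\ngroup"] as $subject) {
        if (preg_match($regex, $subject) !== 1) {
            echo "$label failed on ".json_encode($subject)."\n";
        }
    }
}

// The visible difference: only the capture group adds an entry to $matches.
preg_match($patterns['capture'], "work group", $withCapture);
preg_match($patterns['non-capture'], "work group", $withoutCapture);
echo count($withCapture)." vs ".count($withoutCapture)."\n"; // prints "2 vs 1"

// Timings vary between runs, so no expected numbers are shown here.
$subject = str_repeat("filler ", 1000)."work\ngroup";
foreach ($patterns as $label => $regex) {
    printf("%-12s %.4f s\n", $label, timeRegex($regex, $subject, 20000));
}
```

If the timings come out within a few percent of each other, as they did in the test above, it is probably better to choose based on whether you actually need the captured whitespace.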
Licensed under: CC-BY-SA with attribution