Which is faster: in_array() or a bunch of expressions in PHP?

https://stackoverflow.com/questions/324665

11-07-2019

Question

Is it faster to do the following:

 if ($var != 'test1' && $var != 'test2' && $var != 'test3' && $var != 'test4') { ... }

Or:

 if (!in_array($var, array('test1', 'test2', 'test3', 'test4'))) { ... }

Is there a number of values at which point it's faster to do one or the other?

(In this case, the array used in the second option doesn't already exist.)

Solution

I'd strongly suggest just using in_array(). Any speed difference would be negligible, but the readability of testing each value separately is horrible.

Just for fun, here's a test I ran:

$array = array('test1', 'test2', 'test3', 'test4');
$var = 'test';
$iterations = 1000000;

$start = microtime(true);
for($i = 0; $i < $iterations; ++$i) {
    if ($var != 'test1' && $var != 'test2' && $var != 'test3' && $var != 'test4') {}
}
$end = microtime(true);

print "Time1: ". ($end - $start)."<br />";

$start2 = microtime(true);
for($i = 0; $i < $iterations; ++$i) {
    if (!in_array($var, $array) ) {}
}
$end2 = microtime(true);

print "Time2: ".($end2 - $start2)."<br />";

// Time1: 1.12536692619
// Time2: 1.57462596893

A slightly trivial note to watch for: if $var is not set, method 1 takes much longer (depending on how many conditions you test).
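
One plausible reason for that slowdown (an assumption; the test above did not isolate it) is that every comparison against an unset variable raises a notice (a warning in PHP 8), and each one goes through PHP's error handling. A minimal sketch of reading the possibly missing value once before comparing; `$input` is a hypothetical source array, not part of the original test:

```php
<?php
// Hypothetical input with no 'var' entry (stands in for an unset value).
$input = array();

// The null coalescing operator (PHP 7.0+) reads the value without
// raising a notice or warning when the key is missing.
$var = $input['var'] ?? null;

// The chain of comparisons now runs against a defined variable.
$isOther = ($var != 'test1' && $var != 'test2' && $var != 'test3' && $var != 'test4');

var_dump($isOther); // bool(true)
```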

OTHER TIPS

Note that if you're looking to replace a bunch of !== statements, you should pass the third parameter to in_array as true, which enforces type checking on the items in the array.

Ordinary != doesn't require this, obviously.
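
To make the difference concrete, here is a small sketch with illustrative values (not taken from the question). With the default loose comparison, two numeric strings are compared as numbers; with the third parameter set to true, they are compared the way === would compare them:

```php
<?php
// Illustrative values only.
$allowed = array('10', '20', '30');

// Loose comparison (default): '1e1' and '10' are both numeric strings,
// so they are compared as numbers and considered equal.
$loose = in_array('1e1', $allowed);

// Strict comparison: type and value must match exactly, like ===.
$strict = in_array('1e1', $allowed, true);

var_dump($loose);  // bool(true)
var_dump($strict); // bool(false)
```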

The first will be faster - the second has a lot of overhead: creating the array, calling a function, searching the array...

However, as I said in a question a couple of answers down, premature optimization is the root of all evil. Write your code to be readable; then, if it needs to be optimized, profile it, then optimize.
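
If you do get to the measuring step, a small timing helper keeps the comparison honest. This is only a sketch: `time_it` is a hypothetical helper name, and the closure call adds overhead of its own to both measurements, so only the relative difference is meaningful.

```php
<?php
/**
 * Hypothetical helper: runs $fn $iterations times and returns
 * the elapsed wall-clock time in seconds.
 */
function time_it(callable $fn, $iterations)
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; ++$i) {
        $fn();
    }
    return microtime(true) - $start;
}

$var = 'test';
$values = array('test1', 'test2', 'test3', 'test4');

// Chain of != comparisons.
$elapsedCompare = time_it(function () use ($var) {
    return $var != 'test1' && $var != 'test2' && $var != 'test3' && $var != 'test4';
}, 100000);

// in_array() against a prebuilt array.
$elapsedInArray = time_it(function () use ($var, $values) {
    return !in_array($var, $values);
}, 100000);

printf("comparisons: %.4f s\nin_array:    %.4f s\n", $elapsedCompare, $elapsedInArray);
```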

Edit:

My timings with @Owen's code (PHP 5.2.6 / windows):

Time1: 1.33601498604
Time2: 4.9349629879

Moving the array(...) inside the loop, as in the question:

Time1: 1.34736609459
Time2: 6.29464697838

in_array will be faster for large numbers of items, where "large" is subjective and depends on many factors related to the data and your computer. Since you are asking, I assume you are not dealing with a trivial number of items. For longer lists, measure performance with a flipped array so that PHP can use hash lookups instead of a linear search. For a "static" array that tweak may or may not improve performance.

Using Owen's test code, with a flipped array and more iterations for more consistent results:

$array2 = array_flip($array);
$iterations = 10000000;

$start = microtime(true);
for($i = 0; $i < $iterations; ++$i) {
    if (!isset($array2[$var])) {}
}
$end = microtime(true);
print "Time3: ".($end - $start)."<br />";

Time1: 12.875
Time2: 13.7037701607
Time3: 3.70514011383

I took this case to extremes to show that, as the number of values grows, plain comparison is not the most performant approach.

Here is my code:

$var = 'test';
$num_values = 1000;
$iterations = 1000000;
print "\nComparison performance test with ".$num_values." values and ".$iterations." loop iterations";
print "\n";

$start = microtime(true);
for($i = 0; $i < $iterations; ++$i) {
    if ($var != 'test0' &&
        $var != 'test1' &&
        // ...
        // yes I really have 1000 lines in my file
        // ...
        $var != 'test999') {}
}
print "\nCase 1: plain comparison";
print "\nTime 1: ". (microtime(true) - $start);
print "\n";

$start = microtime(true);
$array1 = array();
for($i=0; $i<$num_values; $i++) {
    $array1[] = 'test'.$i;
}
for($i = 0; $i < $iterations; ++$i) {
    if (!in_array($var, $array1) ) {}
}
print "\nCase 2: in_array comparison";
print "\nTime 2: ".(microtime(true) - $start);
print "\n";

$start = microtime(true);
$array2 = array();
for($i=0; $i<$num_values; $i++) {
    $array2['test'.$i] = 1;
}
for($i = 0; $i < $iterations; ++$i) {
    if (!isset($array2[$var])) {}
}
print "\nCase 3: values as keys, isset comparison";
print "\nTime 3: ".(microtime(true) - $start);
print "\n";

$start = microtime(true);
$array3 = array();
for($i=0; $i<$num_values; $i++) {
    $array3['test'.$i] = 1;
}
for($i = 0; $i < $iterations; ++$i) {
    if (!array_key_exists($var, $array3)) {}
}
print "\nCase 4: values as keys, array_key_exists comparison";
print "\nTime 4: ".(microtime(true) - $start);
print "\n";

My Results (PHP 5.5.9):

Case 1: plain comparison
Time 1: 31.616894006729

Case 2: in_array comparison
Time 2: 23.226133823395

Case 3: values as keys, isset comparison
Time 3: 0.050863981246948

Case 4: values as keys, array_key_exists comparison
Time 4: 0.13700890541077

I agree that's a little extreme, but it shows the big picture and the great potential of PHP's hash-table-like associative arrays. You just have to use them.
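
As a compact illustration of the "values as keys" idea from cases 3 and 4, here is a sketch using array_fill_keys() to build the lookup table in one call. The variable names and values are illustrative. It also shows the one behavioral difference between the two lookups: isset() reports false for a key whose value is null, while array_key_exists() reports true.

```php
<?php
// Illustrative values; not the 1000-entry list from the benchmark.
$values = array('test1', 'test2', 'test3');

// Turn the values into keys once; each lookup then hashes the key
// instead of scanning the whole list.
$set = array_fill_keys($values, true);

var_dump(isset($set['test2'])); // bool(true)
var_dump(isset($set['test9'])); // bool(false)

// The two lookups disagree when the stored value is null.
$withNull = array('test1' => null);
var_dump(isset($withNull['test1']));            // bool(false)
var_dump(array_key_exists('test1', $withNull)); // bool(true)
```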

Note that as RoBorg pointed out, there's overhead in creating the array so it should be moved inside the iteration loop. For this reason, Sparr's post is also a little misleading as there's overhead with the array_flip function.

Here's another example with all 5 variations:

$array = array('test1', 'test2', 'test3', 'test4');
$var = 'test';
$iterations = 1000000;

$start = microtime(true);
for($i = 0; $i < $iterations; ++$i) {
   if ($var != 'test1' && $var != 'test2' && $var != 'test3' && $var != 'test4') {}
}
print "Time1: ". (microtime(true) - $start);

$start = microtime(true);
for($i = 0; $i < $iterations; ++$i) {
   if (!in_array($var, $array) ) {}
}
print "Time2: ".(microtime(true) - $start);

$start = microtime(true);
for($i = 0; $i < $iterations; ++$i) {
   if (!in_array($var, array('test1', 'test2', 'test3', 'test4')) ) {}
}
print "Time2a: ".(microtime(true) - $start);

$array2 = array_flip($array);
$start = microtime(true);
for($i = 0; $i < $iterations; ++$i) {
  if (!isset($array2[$var])) {}
}
print "Time3: ".(microtime(true) - $start);

$start = microtime(true);
for($i = 0; $i < $iterations; ++$i) {
   $array2 = array_flip($array);
   if (!isset($array2[$var])) {}
}
print "Time3a: ".(microtime(true) - $start);

My results:

Time1 : 0.59490108493 // straight comparison
Time2 : 0.83790588378 // array() outside loop - not accurate
Time2a: 2.16737604141 // array() inside loop
Time3 : 0.16908097267 // array_flip outside loop - not accurate
Time3a: 1.57209014893 // array_flip inside loop

In summary, using array_flip (with isset) is faster than in_array but not as fast as a straight comparison.

When speaking of PHP, and asking whether:

a set of "if"s and "else ifs" ,
an "if" with a set of "or"ed conditions (as in the original post details) , or
use of "in_array" with an on-the-fly constructed array ,

is better,

one should keep in mind that the PHP language "switch" statement is an alternative designed for such situations and may be a better answer. (Although the poster's example leads us to just comparing two solutions, the actual question heading asks to consider in_array versus PHP statements, so I think this is fair game).

In the poster's example, then, I would instead recommend:

switch ($var) {
    case 'test1':
    case 'test2':
    case 'test3':
    case 'test4':
        echo "We have a good value";
        break;
    default:
        echo "We do not have a good value";
}

I wish PHP allowed for a couple of non-primitive constructs in the cases, such as a comma for "or". But the above is what the designers of PHP considered to be the clearest way of handling this. And it appears to be more efficient at execution time than the other two alternatives, as well.

As long as I'm talking about a wishlist, the "IN" found in SQL would be even clearer for the poster's example situation.

This thinking is probably what leads to people wanting to use "in_array", for such situations, but it is kind of unfortunate to have to build a data structure and then use a predicate designed for that data structure, rather than having a way to just say it without that overhead happening.
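
For what it's worth, the wish for a comma meaning "or" was later addressed by the match expression added in PHP 8.0, which accepts a comma-separated list of values per arm. A minimal sketch, assuming PHP 8.0 or later; note that match compares strictly (like ===), unlike switch, and no timing claim is made for it here:

```php
<?php
// Requires PHP 8.0 or later.
$var = 'test2';

// Each arm may list several values separated by commas.
$message = match ($var) {
    'test1', 'test2', 'test3', 'test4' => 'We have a good value',
    default => 'We do not have a good value',
};

echo $message, "\n"; // We have a good value
```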

Here is a live update of this bench with another case: https://3v4l.org/OA2S7

The results for PHP 7.3:

multiple comparisons: 0.057507991790771
in_array: 0.02568507194519
array_flip() outside loop measured + isset(): 0.014678001403809
array_flip() outside loop not measured + isset(): 0.015650033950806
foreach and comparison: 0.062782049179077

I know this question is nearly 10 years old, but there are other ways to do this. I used method B from Nick's page with thousands of entries. It was incredibly fast.

foreach (array_values($haystack) as $v) {
    $new_haystack[$v] = 1;
}

// So the new haystack becomes:
// $new_haystack["String1"] = 1;
// $new_haystack["String2"] = 1;
// $new_haystack["String3"] = 1;

// Then check for the key:
if (isset($new_haystack[$needle])) {
    echo("needle ".$needle." found in haystack");
}

My Testing

$array = array('test1', 'test2', 'test3', 'test4');
$var = 'test';
$iterations = 1000000;

$start = microtime(true);
for($i = 0; $i < $iterations; ++$i) {
    if ($var != 'test1' && $var != 'test2' && $var != 'test3' && $var != 'test4') {}
}
$end = microtime(true);

print "Time1: ". ($end - $start)."<br />";

$start2 = microtime(true);
for($i = 0; $i < $iterations; ++$i) {
    if (!in_array($var, $array) ) {}
}
$end2 = microtime(true);

print "Time2: ".($end2 - $start2)."<br />";

$array_flip = array_flip($array);

$start = microtime(true);
for($i = 0; $i < $iterations; ++$i) {
    if (!isset($array_flip[$var])) {}
}
$end = microtime(true);
print "Time3: ".($end - $start)."<br />";

$start = microtime(true);
for($i = 0; $i < $iterations; ++$i) {
    // Note: $array is not flipped here, so this checks for a key named $var, not a value.
    if (!isset($array[$var])) {}
}
$end = microtime(true);

print "Time4: ". ($end - $start)."<br />";

Time1: 0.20001101493835

Time2: 0.32601881027222

Time3: 0.072004079818726

Time4: 0.070003986358643

Licensed under: CC-BY-SA with attribution