This question is mostly for my own benefit, as I always like to write optimized code that can also run on cheap, slow servers (or servers with A LOT of traffic).
I looked around and was not able to find an answer. I was wondering which of these two examples is faster, keeping in mind that the array's keys are not important in my case (pseudo-code, naturally):
    <?php
    $a = array();
    while ($new_val = 'get over 100k email addresses already lowercased') {
        if (!in_array($new_val, $a)) {
            $a[] = $new_val;
            // do other stuff
        }
    }
    ?>

    <?php
    $a = array();
    while ($new_val = 'get over 100k email addresses already lowercased') {
        if (!isset($a[$new_val])) {
            $a[$new_val] = true;
            // do other stuff
        }
    }
    ?>
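For readers who want to run the two variants, here is a minimal sketch with a small hard-coded list standing in for the 100k addresses. The sample data and the variable names `$emails`, `$list`, `$seen` and `$fromKeys` are assumptions for illustration, not part of the question.

```php
<?php
// Hypothetical sample input standing in for the 100k lowercased addresses.
$emails = ['a@example.com', 'b@example.com', 'a@example.com', 'c@example.com', 'b@example.com'];

// Variant 1: values stored as list elements, membership tested with in_array().
$list = [];
foreach ($emails as $email) {
    if (!in_array($email, $list, true)) {
        $list[] = $email;
    }
}

// Variant 2: values stored as array keys, membership tested with isset().
$seen = [];
foreach ($emails as $email) {
    if (!isset($seen[$email])) {
        $seen[$email] = true;
    }
}

// Recover the unique values from the keys of the second variant.
$fromKeys = array_keys($seen);

// Both variants should hold the same unique addresses in the same order.
var_dump($list === $fromKeys);
```

One caveat with the key-based variant: PHP converts keys that look like decimal integers (such as "123") to int, so `array_keys()` would return them as integers rather than strings. Email addresses are not affected.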
As the point of the question is not array collisions, I would like to add that if you are afraid of colliding inserts for $a[$new_val], you can use $a[md5($new_val)]. It can still cause collisions, but it would mitigate a possible DoS attack when reading from a user-provided file (http://nikic.github.com/2011/12/28/Supercolliding-a-PHP-array.html)
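A minimal sketch of the md5-keyed variant mentioned above, assuming a small hypothetical input list. Because the digest cannot be turned back into the address, this sketch keeps the original values in a separate `$unique` array; that extra array is an assumption of the example, not something the question requires.

```php
<?php
// Hypothetical sample input; in the question this would come from a user-provided file.
$emails = ['a@example.com', 'b@example.com', 'a@example.com'];

$seen = [];
$unique = [];
foreach ($emails as $email) {
    // The array key is the 32-character hex MD5 digest, not the raw input.
    $key = md5($email);
    if (!isset($seen[$key])) {
        $seen[$key] = true;
        $unique[] = $email; // keep the original value, since the digest cannot be reversed
    }
}

print_r($unique);
```

The trade-off is one md5() call per address, so this is only worth it when the input is untrusted.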
isset on array keys will be faster than in_array for large numbers of items, with "large" being very subjective and dependent on a lot of factors related to the data and your computer.
array_search() and in_array() called with the $strict = true parameter were the slowest methods in our test.
The main difference between the two functions is that array_search() returns the key (or index) of the first match, or FALSE if there is none, whereas in_array() only returns TRUE or FALSE depending on whether a match was found. In both functions, the value parameter specifies the value to search for in the array.
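The difference in return values can be seen in a short sketch. The `$fruits` array and the variable names are made up for illustration.

```php
<?php
$fruits = ['apple', 'banana', 'cherry'];

// in_array() only reports whether the value is present.
$found = in_array('banana', $fruits, true);

// array_search() returns the key of the first match, or false when there is none.
$key   = array_search('banana', $fruits, true);
$miss  = array_search('kiwi', $fruits, true);
$first = array_search('apple', $fruits, true);

// Key 0 is falsy, so compare against false with !== instead of using the result as a boolean.
$firstWasFound = ($first !== false);

var_dump($found, $key, $miss, $first, $firstWasFound);
```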
The in_array() function searches an array for a specific value. Note: String comparison is always case-sensitive; when the third (strict) parameter is set to TRUE, the types of the values are compared as well.
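A short sketch of how the third parameter and letter case affect in_array(). The sample arrays and variable names are assumptions for illustration.

```php
<?php
$ids = [1, 2, 3];

// Loose comparison (default): the string '1' matches the integer 1.
$loose = in_array('1', $ids);

// Strict comparison: the string '1' does not match the integer 1.
$strict = in_array('1', $ids, true);

$names = ['alice@example.com'];

// Strings that differ only in letter case do not match.
$caseMismatch = in_array('Alice@example.com', $names);

var_dump($loose, $strict, $caseMismatch);
```

This is why the question's note that the addresses are "already lowercased" matters: without that normalization, the same address in different casing would be treated as distinct.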
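The answers so far are spot-on. Using isset in this case is faster because it looks the key up directly in the array's hash table, whereas in_array must check every value until it finds a match. isset is also a language construct, so it avoids the overhead of calling the in_array built-in function. These can be demonstrated by using an array with values (10,000 in the test below), forcing in_array to do more searching.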
    isset:    0.009623
    in_array: 1.738441
This builds on Jason's benchmark by filling in some random values and occasionally finding a value that exists in the array. All random, so beware that times will fluctuate.
    $a = array();
    // Fill the array with random values, using each value as its own key.
    for ($i = 0; $i < 10000; ++$i) {
        $v = rand(1, 1000000);
        $a[$v] = $v;
    }
    echo "Size: ", count($a), PHP_EOL;

    // Time 10,000 key lookups with isset().
    $start = microtime( true );
    for ($i = 0; $i < 10000; ++$i) {
        isset($a[rand(1, 1000000)]);
    }
    $total_time = microtime( true ) - $start;
    echo "Total time: ", number_format($total_time, 6), PHP_EOL;

    // Time 10,000 value searches with in_array().
    $start = microtime( true );
    for ($i = 0; $i < 10000; ++$i) {
        in_array(rand(1, 1000000), $a);
    }
    $total_time = microtime( true ) - $start;
    echo "Total time: ", number_format($total_time, 6), PHP_EOL;
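If the values already sit in a plain list, one possible way to get the same key-based lookup is to flip the list once and then use isset(). Note that array_flip() is a technique swapped in here for illustration and is not part of the benchmark above; the sample list and variable names are likewise assumptions.

```php
<?php
// Hypothetical existing list of values (for example, loaded elsewhere).
$list = ['a@example.com', 'b@example.com', 'c@example.com'];

// array_flip() swaps keys and values: each address becomes a key, its old index the value.
$lookup = array_flip($list);

// isset() is true even when the stored value is 0, because it only checks for null.
$hasA = isset($lookup['a@example.com']);
$hasZ = isset($lookup['z@example.com']);

var_dump($hasA, $hasZ);
```

Flipping costs one pass over the list, so it only pays off when many lookups follow.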
Which is faster: isset() vs in_array()?
isset() is faster.
While it should be obvious, isset() only tests a single value, whereas in_array() will iterate over the array, testing the value of each element until it finds a match.
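To make the iteration cost concrete, here is an illustrative linear search that counts how many elements it examines. `count_comparisons()` is a hypothetical helper written for this sketch; it is not how PHP implements in_array() internally, only a model of the same element-by-element behaviour.

```php
<?php
// Hypothetical helper: returns how many elements a linear search examines
// before it finds $needle (or the full array size when $needle is absent).
function count_comparisons(array $haystack, $needle): int
{
    $examined = 0;
    foreach ($haystack as $value) {
        $examined++;
        if ($value === $needle) {
            break;
        }
    }
    return $examined;
}

$values = range(1, 1000);

$bestCase  = count_comparisons($values, 1);     // match on the first element
$worstCase = count_comparisons($values, 1000);  // match on the last element
$missing   = count_comparisons($values, -1);    // no match, every element examined

var_dump($bestCase, $worstCase, $missing);
```

A key lookup with isset() does not grow with the array size in this way, which is why the gap widens as the array gets larger.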
Rough benchmarking is quite easy using microtime().
    Total time isset():    0.002857
    Total time in_array(): 0.017103
Note: Results were similar regardless of whether the key or value existed or not.
    <?php
    $a = array();

    // Time 10,000 isset() checks on a key.
    $start = microtime( true );
    for ($i = 0; $i < 10000; ++$i) {
        isset($a['key']);
    }
    $total_time = microtime( true ) - $start;
    echo "Total time: ", number_format($total_time, 6), PHP_EOL;

    // Time 10,000 in_array() searches for a value.
    $start = microtime( true );
    for ($i = 0; $i < 10000; ++$i) {
        in_array('key', $a);
    }
    $total_time = microtime( true ) - $start;
    echo "Total time: ", number_format($total_time, 6), PHP_EOL;
    exit;
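The microtime() pattern above can be wrapped in a small helper so that both checks are timed the same way. This is only a sketch: `time_it()` is a hypothetical function, and the array size and iteration count are arbitrary assumptions. Timings will vary from run to run and from machine to machine.

```php
<?php
// Hypothetical helper: runs $fn $iterations times and returns the elapsed seconds.
function time_it(callable $fn, int $iterations): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; ++$i) {
        $fn();
    }
    return microtime(true) - $start;
}

// The same 1,000 numbers, once as keys and once as values.
$data = array_fill_keys(range(1, 1000), true);
$values = array_keys($data);

$issetTime = time_it(function () use ($data) {
    isset($data[999]);
}, 10000);

$inArrayTime = time_it(function () use ($values) {
    in_array(999, $values);
}, 10000);

echo "Total time isset():    ", number_format($issetTime, 6), PHP_EOL;
echo "Total time in_array(): ", number_format($inArrayTime, 6), PHP_EOL;
```

Calling through a closure adds overhead to both measurements, so the isset() figure in particular will look slower than in the inline loop; the helper is meant for comparing like with like, not for absolute numbers.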
I'd encourage you to also look at: