I have the following simple code to test against collision on a primary key I am creating:
$machine_ids = array();
for($i = 0; $i < 100000; $i++) {
//Generate machine id returns a 15 character alphanumeric string
$mid = Functions::generate_machine_id();
if(in_array($mid, $machine_ids)) {
die("Collision!");
} else {
$machine_ids[] = $mid;
}
}
die("Success!");
Any idea why this is taking many minutes to run? Anyway to speed it up?
If you need best performance for your case, you need store your data as array key and
use isset
or array_key_exists
(since php >= 7.4 array_key_exists
is now as fast as isset
) instead in_array
.
Attention. It is true that isset on a hash map is faster than searching through an array for a value (in_array), but keep in mind that converting an array of values, ["foo", "bar", "baz"], to a hash map, ["foo" => true, "bar" => true, "baz" => true], incurs a memory cost (as well as potentially constructing the hash map, depending on how and when you do it). As with all things, you'll have to weigh the pros & cons for each case to determine if a hash map or array (list) of values works best for your needs. This isn't specific to PHP but more of a general problem space of computer science.
And some performance tests from https://gist.github.com/alcaeus/536156663fac96744eba77b3e133e50a
<?php declare(strict_types = 1);
function testPerformance($name, Closure $closure, $runs = 1000000)
{
$start = microtime(true);
for (; $runs > 0; $runs--)
{
$closure();
}
$end = microtime(true);
printf("Function call %s took %.5f seconds\n", $name, $end - $start);
}
$items = [1111111];
for ($i = 0; $i < 100000; $i++) {
$items[] = rand(0, 1000000);
}
$items = array_unique($items);
shuffle($items);
$assocItems = array_combine($items, array_fill(0, count($items), true));
$in_array = function () use ($items) {
in_array(1111111, $items);
};
$isset = function () use ($assocItems) {
isset($items[1111111]);
};
$array_key_exists = function () use ($assocItems) {
array_key_exists(1111111, $assocItems);
};
testPerformance('in_array', $in_array, 100000);
testPerformance('isset', $isset, 100000);
testPerformance('array_key_exists', $array_key_exists, 100000);
Output:
Function call in_array took 5.01030 seconds
Function call isset took 0.00627 seconds
Function call array_key_exists took 0.00620 seconds
for($i = 0; $i < 100000; $i++)
{
//Generate machine id returns a 15 character alphanumeric string
$mid = Functions::generate_machine_id();
if (isset($machine_ids[$mid]))
{
die("Collision!");
}
$machine_ids[$mid] = true;
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With