Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is in_array strict mode on integers slower than non-strict mode?

I always thought that in_array strict mode will be faster or at least the same speed as non-strict mode. But after some benchmarks I noticed there is a huge difference in execution time between them while searching for integers. The string and array tests indicate that strict mode is faster. Why?

Test code - (PHP 7.2.1):

<?php

$array = array_fill(0, 10000, 12345);

for ($i=0; $i<100000; $i++) {
    in_array($i, $array, true);
}

time php test.php

php -c test.php 12.98s user 0.04s system 98% cpu 13.234 total


<?php

$array = array_fill(0, 10000, 12345);

for ($i=0; $i<100000; $i++) {
    in_array($i, $array, false);
}

time php test.php

php -c test.php 6.44s user 0.04s system 99% cpu 6.522 total

like image 693
Tom Avatar asked Jul 03 '19 12:07

Tom


2 Answers

I can offer some small insight from tracing through the C source for in_array.

It turns out, when comparing integers, the path to reach the actual equality check for non-strict mode involves fewer operations than strict mode.

Strict mode

In the case where the strict flag to in_array is true, the following occurs:

  1. We call fast_is_identical_function for each element in the array

  2. The fast_is_identical_function first tests that the types of each operand are different (Z_TYPE_P(op1) != Z_TYPE_P(op2)) in hopes of being able to return false early; this is comparison #1.

  3. If the types are the same (they are, in your test case), we then test (Z_TYPE_P(op1) <= IS_TRUE; I've no idea what this does, but it's comparison #2.

  4. After both comparisons have evaluated to false, we jump into zend_is_identical, our first function invocation.

  5. zend_is_identical starts out by again testing Z_TYPE_P(op1) != Z_TYPE_P(op2), another attempt to fail early. This is comparison #3.

  6. If the types are the same, we can descend through the switch (Z_TYPE_P(op1)) statement, comparison #4

  7. Finally we reach the comparison Z_LVAL_P(op1) == Z_LVAL_P(op2) which actually tests the equality of the two values, comparison #5.

In total, to test whether each element of the array is equal to the value we're searching for, there are 5 comparisons and 1 function invocation.

Non-strict mode

By comparison, the non-strict flow for integers specifically (really LONGs) is much simpler, as follows:

  1. Instead of fast_is_identical_function, we instead use fast_equal_check_function for each element in the array.

  2. The method fast_equal_check_function starts a much more complicated process of comparing the two values with all kinds of type-casting logic. However, the very first test it does happens to be optimized for integers, as follows:

    if (EXPECTED(Z_TYPE_P(op1) == IS_LONG)) {
        if (EXPECTED(Z_TYPE_P(op2) == IS_LONG)) {
            return Z_LVAL_P(op1) == Z_LVAL_P(op2);

    We can see that it...

    1. immediately tests whether type of op1 is LONG, which it is, then
    2. immediately tests whether the type of op2 is LONG, which it is, then
    3. immediately returns the result of Z_LVAL_P(op1) == Z_LVAL_P(op2)

A total of 3 simple equality comparisons and 0 function invocations for the non-strict case, vs at least 5 comparisons and 1 jump for the strict case.

This appears to be a case where attempted early optimization makes the strict check slower (by repeatedly testing the types of the operands in hopes that we can find inequality sooner) than the specific non-strict case of comparing two integers.

like image 113
meagar Avatar answered Nov 14 '22 18:11

meagar


It seems to have something to do with the type of element in the needle and/or haystack, observe:

PHP 7.3.5 from http://sandbox.onlinephpfunctions.com/

$iterations = 10000000;
$needle = false;
$haystack = [ true ];

$start = microtime( true );
for( $i = 0; $i < $iterations; ++$i )
{
    in_array( $needle, $haystack, true );
}
echo ( microtime( true ) - $start ).' strict'.PHP_EOL;

$start = microtime( true );
for( $i = 0; $i < $iterations; ++$i )
{
    in_array( $needle, $haystack, false );
}
echo ( microtime( true ) - $start ).' not strict';

produces:

0.29996585845947 strict
0.40397191047668 not strict

but if we use:

$needle = 1;
$haystack = [ 2 ];

then we get:

0.34480714797974 strict
0.28275084495544 not strict

However, PHP 5.6.29 produces a negligible discrepancy and running the same test multiple times can put strict ahead of non-strict and vice versa.

like image 30
MonkeyZeus Avatar answered Nov 14 '22 16:11

MonkeyZeus