Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

PHP object array's not linearly scale while global arrays do?

There is a major performance issue when using in-object array's as a property versus using a global php array variable, why?

To benchmark this problem I created the following benchmark that stores an increasingly larger array with an stdClass as a node, two tests were run one using an array property in a class the other a global array.

The test code

ini_set('memory_limit', '2250M');
class MyTest {
    public $storage = [];
    public function push(){
        $this->storage[] = [new stdClass()];
    }
}

echo "Testing Objects".PHP_EOL;
for($size = 1000; $size < 5000000; $size *= 2) {
    $start = milliseconds();
    for ($a=new MyTest(), $i=0;$i<$size;$i++) {
        $a->push();
    }
    $end = milliseconds();
    echo "Array Size $size".PHP_EOL;
    echo $end - $start . " milliseconds to perform".PHP_EOL;
}
echo "================".PHP_EOL;
echo "Testing Array".PHP_EOL;
for($size = 1000; $size < 5000000; $size *= 2) {
    $start = milliseconds();
    for ($a=[], $i=0;$i<$size;$i++) {
        $a[] = [new stdClass()];
    }
    $end = milliseconds();
    echo "Array Size $size".PHP_EOL;
    echo $end - $start . " milliseconds to perform".PHP_EOL;
}

And the shocking results:

Testing Objects
Array Size 1000
2 milliseconds to perform
Array Size 2000
3 milliseconds to perform
Array Size 4000
6 milliseconds to perform
Array Size 8000
12 milliseconds to perform
Array Size 16000
35 milliseconds to perform
Array Size 32000
97 milliseconds to perform
Array Size 64000
246 milliseconds to perform
Array Size 128000
677 milliseconds to perform
Array Size 256000
2271 milliseconds to perform
Array Size 512000
9244 milliseconds to perform
Array Size 1024000
31186 milliseconds to perform
Array Size 2048000
116123 milliseconds to perform
Array Size 4096000
495588 milliseconds to perform
================
Testing Array
Array Size 1000
1 milliseconds to perform
Array Size 2000
2 milliseconds to perform
Array Size 4000
4 milliseconds to perform
Array Size 8000
8 milliseconds to perform
Array Size 16000
28 milliseconds to perform
Array Size 32000
61 milliseconds to perform
Array Size 64000
114 milliseconds to perform
Array Size 128000
245 milliseconds to perform
Array Size 256000
494 milliseconds to perform
Array Size 512000
970 milliseconds to perform
Array Size 1024000
2003 milliseconds to perform
Array Size 2048000
4241 milliseconds to perform
Array Size 4096000
14260 milliseconds to perform

Now besides the obvious overhead of the object calls itself the object array property scales terribly sometimes taking 3 - 4 times longer when the array becomes larger but this is not the case with the standard global array variable.

Any thoughts or answers regarding this problem and is this a possible bug with the PHP engine?

like image 904
Nick Avatar asked May 23 '12 14:05

Nick


2 Answers

I tested your code on PHP 5.3.9. To do so I had to translate [] to array(), and I also had to correct your line #12: from $a=new MyTest($size), to $mytest=new MyTest($size) (BTW, the constructor argument gets silently ignored, funny). I also added this code:

echo "================".PHP_EOL;
echo "Testing Function".PHP_EOL;
for($size = 1000; $size < 1000000; $size *= 2) {
    $start = milliseconds();
    for ($a=array(), $i=0;$i<$size;$i++) {
        my_push($a);
    }
    $end = milliseconds();
    echo "Array Size $size".PHP_EOL;
    echo $end - $start . " milliseconds to perform".PHP_EOL;
    echo "memory usage: ".memory_get_usage()." , real: ".memory_get_usage(true).PHP_EOL;
}

function my_push(&$a)
{
   $a[] = array(new stdClass());
}

I added the memory usage line to your loops at the same point, added an unset($mytest); after the object case (to get a more consistent memory log), and also replaced your 5000000's with 1000000's because I only have 2GB of RAM. This is what I got:

Testing Objects
Array Size 1000
2 milliseconds to perform
memory usage: 1666376 , real: 1835008
Array Size 2000
5 milliseconds to perform
memory usage: 2063280 , real: 2097152
Array Size 4000
10 milliseconds to perform
memory usage: 2857008 , real: 2883584
Array Size 8000
19 milliseconds to perform
memory usage: 4444456 , real: 4718592
Array Size 16000
44 milliseconds to perform
memory usage: 7619392 , real: 8126464
Array Size 32000
103 milliseconds to perform
memory usage: 13969256 , real: 14417920
Array Size 64000
239 milliseconds to perform
memory usage: 26668936 , real: 27262976
Array Size 128000
588 milliseconds to perform
memory usage: 52068368 , real: 52690944
Array Size 256000
1714 milliseconds to perform
memory usage: 102867104 , real: 103546880
Array Size 512000
5452 milliseconds to perform
memory usage: 204464624 , real: 205258752
================
Testing Array
Array Size 1000
1 milliseconds to perform
memory usage: 18410640 , real: 20709376
Array Size 2000
4 milliseconds to perform
memory usage: 18774760 , real: 20709376
Array Size 4000
7 milliseconds to perform
memory usage: 19502976 , real: 20709376
Array Size 8000
13 milliseconds to perform
memory usage: 20959360 , real: 21233664
Array Size 16000
29 milliseconds to perform
memory usage: 23872176 , real: 24379392
Array Size 32000
61 milliseconds to perform
memory usage: 29697720 , real: 30146560
Array Size 64000
124 milliseconds to perform
memory usage: 41348856 , real: 41943040
Array Size 128000
280 milliseconds to perform
memory usage: 64651088 , real: 65273856
Array Size 256000
534 milliseconds to perform
memory usage: 111255536 , real: 111935488
Array Size 512000
1085 milliseconds to perform
memory usage: 204464464 , real: 205258752
================
Testing Function
Array Size 1000
357 milliseconds to perform
memory usage: 18410696 , real: 22544384
Array Size 2000
4 milliseconds to perform
memory usage: 18774768 , real: 22544384
Array Size 4000
9 milliseconds to perform
memory usage: 19503008 , real: 22544384
Array Size 8000
17 milliseconds to perform
memory usage: 20959392 , real: 22544384
Array Size 16000
36 milliseconds to perform
memory usage: 23872208 , real: 24379392
Array Size 32000
89 milliseconds to perform
memory usage: 29697720 , real: 30146560
Array Size 64000
224 milliseconds to perform
memory usage: 41348888 , real: 41943040
Array Size 128000
529 milliseconds to perform
memory usage: 64651088 , real: 65273856
Array Size 256000
1587 milliseconds to perform
memory usage: 111255616 , real: 111935488
Array Size 512000
5244 milliseconds to perform
memory usage: 204464512 , real: 205258752

As you can see, appending to the array inside a function call costs almost as much as (and has the same non-linear behavior as) doing it inside your original method call. One thing can be said for sure:

It's the function calls that eat up CPU time!

Regarding the non-linear behavior, it becomes really evident only above a certain threshold. While all three cases have the same memory behavior (because of incomplete gargabe collection this is only evident among the "plain array" and the "array inside function" case, in this log), it is the "array inside method" and the "array inside function" cases that have the same execution time behavior. This means that it's the function calls themselves that cause a non-linear increase in time. It seems to me that this can be said:

The amount of data that is around during a function call influences its duration.

To verify this I replaced all $a[] with $a[0] and all 1000000 with 5000000 (to get similar total execution times) and obtained this output:

Testing Objects
Array Size 1000
2 milliseconds to perform
memory usage: 1302672 , real: 1572864
Array Size 2000
4 milliseconds to perform
memory usage: 1302672 , real: 1572864
Array Size 4000
8 milliseconds to perform
memory usage: 1302672 , real: 1572864
Array Size 8000
15 milliseconds to perform
memory usage: 1302672 , real: 1572864
Array Size 16000
31 milliseconds to perform
memory usage: 1302672 , real: 1572864
Array Size 32000
62 milliseconds to perform
memory usage: 1302672 , real: 1572864
Array Size 64000
123 milliseconds to perform
memory usage: 1302672 , real: 1572864
Array Size 128000
246 milliseconds to perform
memory usage: 1302672 , real: 1572864
Array Size 256000
493 milliseconds to perform
memory usage: 1302672 , real: 1572864
Array Size 512000
985 milliseconds to perform
memory usage: 1302672 , real: 1572864
Array Size 1024000
1978 milliseconds to perform
memory usage: 1302672 , real: 1572864
Array Size 2048000
3965 milliseconds to perform
memory usage: 1302672 , real: 1572864
Array Size 4096000
7905 milliseconds to perform
memory usage: 1302672 , real: 1572864
================
Testing Array
Array Size 1000
1 milliseconds to perform
memory usage: 1302464 , real: 1572864
Array Size 2000
3 milliseconds to perform
memory usage: 1302464 , real: 1572864
Array Size 4000
5 milliseconds to perform
memory usage: 1302464 , real: 1572864
Array Size 8000
10 milliseconds to perform
memory usage: 1302464 , real: 1572864
Array Size 16000
20 milliseconds to perform
memory usage: 1302464 , real: 1572864
Array Size 32000
40 milliseconds to perform
memory usage: 1302464 , real: 1572864
Array Size 64000
80 milliseconds to perform
memory usage: 1302464 , real: 1572864
Array Size 128000
161 milliseconds to perform
memory usage: 1302464 , real: 1572864
Array Size 256000
322 milliseconds to perform
memory usage: 1302464 , real: 1572864
Array Size 512000
646 milliseconds to perform
memory usage: 1302464 , real: 1572864
Array Size 1024000
1285 milliseconds to perform
memory usage: 1302464 , real: 1572864
Array Size 2048000
2574 milliseconds to perform
memory usage: 1302464 , real: 1572864
Array Size 4096000
5142 milliseconds to perform
memory usage: 1302464 , real: 1572864
================
Testing Function
Array Size 1000
1 milliseconds to perform
memory usage: 1302464 , real: 1572864
Array Size 2000
4 milliseconds to perform
memory usage: 1302464 , real: 1572864
Array Size 4000
6 milliseconds to perform
memory usage: 1302464 , real: 1572864
Array Size 8000
14 milliseconds to perform
memory usage: 1302464 , real: 1572864
Array Size 16000
26 milliseconds to perform
memory usage: 1302464 , real: 1572864
Array Size 32000
53 milliseconds to perform
memory usage: 1302464 , real: 1572864
Array Size 64000
105 milliseconds to perform
memory usage: 1302464 , real: 1572864
Array Size 128000
212 milliseconds to perform
memory usage: 1302464 , real: 1572864
Array Size 256000
422 milliseconds to perform
memory usage: 1302464 , real: 1572864
Array Size 512000
844 milliseconds to perform
memory usage: 1302464 , real: 1572864
Array Size 1024000
1688 milliseconds to perform
memory usage: 1302464 , real: 1572864
Array Size 2048000
3377 milliseconds to perform
memory usage: 1302464 , real: 1572864
Array Size 4096000
6814 milliseconds to perform
memory usage: 1302464 , real: 1572864

Note how the times are almost perfectly linear now. Of course, the array size is stuck to 1 now. Note also how the differences of the execution times of the three cases are less pronounced than before. Remember that the innermost operation is the same in all cases.

I'm not going to try to fully explain all this (gargabe collection on function call? memory fragmentation? ...?), but I think that I have nonetheless collected some useful information, for everyone here and for myself too.

like image 127
Walter Tross Avatar answered Oct 07 '22 13:10

Walter Tross


I can't post this all in a comment, so this is more of an observation than an answer. It looks like SplObjectStorage is fairly slow. Also that array_push is a lot faster than $array[] = 'item';

Disclaimer: Apologies for the sloppy code :)

<?php

$time = microtime();
$time = explode(' ', $time);
$time = $time[1] + $time[0];
$start = $time;

$iteration = 10000;

switch ($_REQUEST['test'])
{
    case 1:
        $s = new SplObjectStorage();

        for ($i = 0; $i < $iteration; $i++) {
            $obj = new stdClass;
            $s[$obj] = 'test';
        }
        break;
    case 2:

        $s = array();
        for ($i = 0; $i < $iteration; $i++) {
            $obj = new stdClass;
            $s[$i] = $obj;
        }
        break;

    case 3:
        class Test {
            public $data = array();
        }
        $s = new Test;
        for ($i = 0; $i < $iteration; $i++) {
            $obj = new stdClass;
            $s->data[] = $obj;
        }
        break;

    case 4:
        class Test {
            public static $data = array();
        }
        $s = new Test;
        for ($i = 0; $i < $iteration; $i++) {
            $obj = new stdClass;
            $s->data[] = $obj;
        }
        break;  
    case 5:
        class Test {
            public $data = array();
        }
        $s = new Test;
        for ($i = 0; $i < $iteration; $i++) {
            $obj = new stdClass;
            array_push($s->data, $obj);
        }
        break;  
    default:
        echo 'Type in ?test=#';
}

$time = microtime();
$time = explode(' ', $time);
$time = $time[1] + $time[0];
$finish = $time;
$total_time = round(($finish - $start), 6);
echo 'Page generated in '.$total_time.' seconds.';
like image 37
JREAM Avatar answered Oct 07 '22 13:10

JREAM