
How does PHP’s array memory usage management work?

Tags:

php

memory

I'm trying to figure out how PHP loads arrays into memory and when passing an array consumes memory.

So I’ve got this little bit of code running (the contents of the input array are not important for this example):

<?php

echo $this->getMemoryUsage();
$arr = $query->result_array(); // array of arrays from codeigniter
echo $this->getMemoryUsage();

This consumes exactly 250 kB of memory, which means the array is roughly 250 kB in size.

So I ran the following code:

<?php

echo $this->getMemoryUsage();
$arr = $query->result_array(); // array of arrays from codeigniter

$arr[0]['id'] = 'changing this value';

$foo = $arr;
$foo[2]['id'] = 'changing this value again';

$bar = $foo;
$bar[4]['id'] = 'changing this value again and again';

$far = $bar;
$far[5]['id'] = 'changing this value again and again and again';

echo $this->getMemoryUsage();

According to what I read and was told, PHP doesn’t actually copy the array on assignment; it only references the original array. Once a change is made, however, PHP has to copy the entire array.

Imagine my surprise when the above code consumes exactly 500 kB of RAM.

Can anyone explain what’s going on here?

Just to be clear, all these indices (0–5 and id) already exist in the original array; I’m just modifying the value. The original value is some integer.

EDIT

Just to rule out the involvement of $this->result();, here's another test I've conducted:

    echo $this->getMemoryUsage();
    $arr = $query->result_array(); // array of arrays from codeigniter
    //$arr[0]['id'] = 'changing this value';

    $foo = $arr;
    $foo[2]['id'] = 'changing this value again';

    //$bar = $foo;
    //$bar[4]['id'] = 'changing this value again and again';
    //
    //$far = $bar;
    //$far[4]['id'] = 'changing this value again and again and again';

    echo $this->getMemoryUsage();

This time the output is exactly 250 kB, just like the original trial without any changes.

EDIT #2

As requested, I've run the code from here on my setup to make sure the results are consistent: http://pastebin.com/cYNg4cg7

These are the results:

DECLARATION: 4608 kB
FINAL: 8904 kB
DIFF TO DECLARATION: 4296 kB

So even though the declaration was 4608 kB and the array was passed and changed 4 times, the memory footprint still less than doubled.

EDIT #3

I've measured the memory change after each allocation:

DECLARATION: 5144 kB
allocating A0 added : 144 kB
allocating A1 added : 1768 kB
allocating A2 added : 1768 kB
allocating A3 added : 1768 kB
FINAL: 10744 kB
DIFF TO DECLARATION: 5600 kB

Each operation after the first costs exactly the same, which seems to indicate that the same amount is being copied each time. This seems to support Austin's answer. The only thing that doesn't add up now is the size that's allocated, but that's a different question.

Seems like Austin's on the ball; I'll accept it if no other answer comes by.

asked Jan 11 '15 18:01 by Patrick


1 Answer

Here's what I think is going on:

PHP arrays are copy-on-write, as you say, but each level of a multi-dimensional array is copy-on-write separately. PHP reuses the unchanged parts of a multi-dimensional array instead of treating the whole thing as one unit. (This is similar to file systems that support snapshots, such as ZFS.)

Example: say we have this array

$x = array('foo' => array(1, 2, 3), 'bar' => array(4, 5, 6));

This is stored in memory not as a single chunk, but as separate chunks here labeled A, B, C, and $x:

array(1, 2, 3) //A
array(4, 5, 6) //B
array('foo' => {pointer to A}, 'bar' => {pointer to B}) //C
{pointer to C} //$x

Now let's make a copy of $x:

$y = $x;

This uses very little extra memory, because all it has to do is create another pointer to C:

array(1, 2, 3) //A
array(4, 5, 6) //B
array('foo' => {pointer to A}, 'bar' => {pointer to B}) //C
{pointer to C} //$x
{pointer to C} //$y
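A minimal sketch to check this on your own setup, assuming PHP 7 or later. The function name measureAssignmentCost and the generated rows are made up for illustration and are not the asker's data. The array is built at runtime and the measurement happens inside a function so that only the assignment itself is measured:

```php
<?php
// Hypothetical helper: returns how many bytes a plain array assignment adds.
function measureAssignmentCost(): int
{
    // Build a nested array at runtime (illustrative data only).
    $x = [];
    for ($i = 0; $i < 1000; $i++) {
        $x[] = ['id' => $i, 'name' => "row $i"];
    }

    $before = memory_get_usage();
    $y = $x; // no element is copied here; $y points at the same array as $x
    $after = memory_get_usage();

    return $after - $before;
}

$assignCost = measureAssignmentCost();
echo "assignment added: ", $assignCost, " bytes", PHP_EOL;
```

The reported number should be zero or close to it, however large $x is.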

Now let's change $y:

$y['foo'][0] = 10;

Here's what DOESN'T happen:

array(1, 2, 3) //A
array(10, 2, 3) //A2
array(4, 5, 6) //B
array(4, 5, 6) //B2
array('foo' => {pointer to A}, 'bar' => {pointer to B}) //C
array('foo' => {pointer to A2}, 'bar' => {pointer to B2}) //C2
{pointer to C} //$x
{pointer to C2} //$y

Notice that B and B2 are identical. There's no need to keep the same thing twice, so what actually happens is this:

array(1, 2, 3) //A
array(10, 2, 3) //A2
array(4, 5, 6) //B
array('foo' => {pointer to A}, 'bar' => {pointer to B}) //C
array('foo' => {pointer to A2}, 'bar' => {pointer to B}) //C2
{pointer to C} //$x
{pointer to C2} //$y

In this simple case, the benefit is pretty small, but imagine that instead of three numbers, the 'bar' array contained thousands of numbers. You end up saving huge amounts of memory.
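Whatever sharing happens internally, the behavior you can observe is ordinary value semantics: the write to $y never shows up in $x. A short check using the same $x and $y as in the example above:

```php
<?php
$x = array('foo' => array(1, 2, 3), 'bar' => array(4, 5, 6));
$y = $x;
$y['foo'][0] = 10;

// $x keeps its original value; only $y sees the change.
var_dump($x['foo'][0] === 1);
var_dump($y['foo'][0] === 10);

// 'bar' was never written to, so both variables still hold equal contents.
var_dump($x['bar'] === $y['bar']);
```

All three lines print bool(true). The sharing is purely a memory optimization and cannot be detected by comparing values.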

Relating this to your original code, try printing out the memory usage not only at the start and the end, but after every new array assignment. You'll see that the memory usage increases by only a fraction of what the original array takes up after each step. This is because only part of the array is being copied, not the whole thing. Specifically, the first-level array and the specific sub array you change get copied, but the other sub arrays do not get copied.
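A rough sketch of that per-step measurement, assuming PHP 7 or later. buildRows and measureSteps are hypothetical names, and the generated rows only stand in for the CodeIgniter result, so the absolute numbers will differ from yours:

```php
<?php
// Hypothetical stand-in for the query result: small rows built at runtime.
function buildRows(int $count): array
{
    $rows = [];
    for ($i = 0; $i < $count; $i++) {
        $rows[] = ['id' => $i, 'payload' => str_repeat('x', 200)];
    }
    return $rows;
}

// Records how much memory the declaration and each copy-then-modify step add.
function measureSteps(): array
{
    $start = memory_get_usage();
    $arr = buildRows(1000);
    $report = ['declaration' => memory_get_usage() - $start];

    $mark = memory_get_usage();
    $foo = $arr;
    $foo[2]['id'] = 'changed once';  // copies the outer array and row 2 only
    $report['foo'] = memory_get_usage() - $mark;

    $mark = memory_get_usage();
    $bar = $foo;
    $bar[4]['id'] = 'changed twice'; // copies the outer array and row 4 only
    $report['bar'] = memory_get_usage() - $mark;

    return $report;
}

$report = measureSteps();
foreach ($report as $step => $bytes) {
    echo str_pad($step, 12), $bytes, " bytes", PHP_EOL;
}
```

With rows like these, each step should add far less than the declaration did, and the steps should cost about the same as each other, since every step copies the same first-level array plus one row.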

The fact that the final amount of memory used is twice as much as the starting amount seems to be a coincidence due to the particular setup of your code and the number of copies of the array you make.

(In reality, PHP can do even better than what I describe here (it will probably keep only one copy of 'foo' and 'bar', etc.), but for the most part it boils down to the same sort of trick.)

If you want a more dramatic demonstration of this, do something like this:

$base = memory_get_usage();
$x = array('small' => array('this is small'), 'big' => array());
for ($i = 0; $i < 1000000; $i++) {
    $x['big'][] = $i;
}
echo (memory_get_usage() - $base).PHP_EOL; //a lot of memory
$y = $x;
$y['small'][0] = 'now a bit bigger';
echo (memory_get_usage() - $base).PHP_EOL; //a bit more memory
$z = $x;
$z['big'][0] = 2;
echo (memory_get_usage() - $base).PHP_EOL; //a LOT more memory
answered Oct 12 '22 09:10 by Austin