I'm trying to figure out how PHP loads arrays into memory and when passing an array consumes memory.
I've got this little bit of code running (the contents of the input array are not important for this example):
<?php
echo $this->getMemoryUsage();
$arr = $query->result_array(); // array of arrays from codeigniter
echo $this->getMemoryUsage();
This consumes exactly 250 kB of memory, which means the array is roughly 250 kB in size.
So I ran the following code:
<?php
echo $this->getMemoryUsage();
$arr = $query->result_array(); // array of arrays from codeigniter
$arr[0]['id'] = 'changing this value';
$foo = $arr;
$foo[2]['id'] = 'changing this value again';
$bar = $foo;
$bar[4]['id'] = 'changing this value again and again';
$far = $bar;
$far[5]['id'] = 'changing this value again and again and again';
echo $this->getMemoryUsage();
According to what I read and was told, PHP doesn't actually copy the array on assignment; it only references the original array. Once a change is made, however, PHP has to copy the entire array.
Imagine my surprise when the above code consumes exactly 500 kB of RAM.
Can anyone explain what’s going on here?
Just to be clear, all these indices (0–5 and id) already exist in the original array; I'm just modifying the values. Each original value is an integer.
EDIT
Just to rule out the involvement of $this->result(), here's another test I've conducted:
echo $this->getMemoryUsage();
$arr = $query->result_array(); // array of arrays from codeigniter
//$arr[0]['id'] = 'changing this value';
$foo = $arr;
$foo[2]['id'] = 'changing this value again';
//$bar = $foo;
//$bar[4]['id'] = 'changing this value again and again';
//
//$far = $bar;
//$far[4]['id'] = 'changing this value again and again and again';
echo $this->getMemoryUsage();
This time the output is exactly 250 kB, just like the original trial without any changes.
EDIT #2
As requested, I ran the code from here on my setup to make sure the results are consistent: http://pastebin.com/cYNg4cg7
These are the results:
DECLARATION: 4608 kB
FINAL: 8904 kB
DIFF TO DECLARATION: 4296 kB
So even though the declaration was 4608 kB and the array was passed and changed 4 times, the memory footprint still less than doubled.
EDIT #3
I measured the memory change after each allocation:
DECLARATION: 5144 kB
allocating A0 added : 144 kB
allocating A1 added : 1768 kB
allocating A2 added : 1768 kB
allocating A3 added : 1768 kB
FINAL: 10744 kB
DIFF TO DECLARATION: 5600 kB
Each operation after the first costs exactly the same, which seems to indicate that the exact same amount is being copied each time. This seems to support Austin's answer. The only thing that doesn't add up now is the size that's allocated, but that's a different question.
Seems like Austin's on the ball; I'll accept it if no other answer comes by.
Here's what I think is going on:
PHP arrays are copy on write as you say, but each level of a multi-dimensional array is separately copy on write. PHP is very smart about reusing parts of a multi-dimensional array and not just the whole thing. (This is similar to some file systems that support snapshots, like ZFS.)
Example: say we have this array:
$x = array('foo' => array(1, 2, 3), 'bar' => array(4, 5, 6));
This is stored in memory not as a single chunk, but as separate chunks, here labeled A, B, C, and $x:
array(1, 2, 3) //A
array(4, 5, 6) //B
array('foo' => {pointer to A}, 'bar' => {pointer to B}) //C
{pointer to C} //$x
Now let's make a copy of $x:
$y = $x;
This uses very little extra memory, because all it has to do is create another pointer to C:
array(1, 2, 3) //A
array(4, 5, 6) //B
array('foo' => {pointer to A}, 'bar' => {pointer to B}) //C
{pointer to C} //$x
{pointer to C} //$y
Now let's change $y:
$y['foo'][0] = 10;
Here's what DOESN'T happen:
array(1, 2, 3) //A
array(10, 2, 3) //A2
array(4, 5, 6) //B
array(4, 5, 6) //B2
array('foo' => {pointer to A}, 'bar' => {pointer to B}) //C
array('foo' => {pointer to A2}, 'bar' => {pointer to B2}) //C2
{pointer to C} //$x
{pointer to C2} //$y
Notice that B and B2 are identical. There's no need to keep the same thing twice, so what actually happens is this:
array(1, 2, 3) //A
array(10, 2, 3) //A2
array(4, 5, 6) //B
array('foo' => {pointer to A}, 'bar' => {pointer to B}) //C
array('foo' => {pointer to A2}, 'bar' => {pointer to B}) //C2
{pointer to C} //$x
{pointer to C2} //$y
In this simple case, the benefit is pretty small, but imagine that instead of three numbers, the 'bar' array contained thousands of numbers. You end up saving huge amounts of memory.
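Whatever sharing happens internally, the script itself never sees it: arrays still behave as if every assignment made a full copy. Here is a minimal sketch of the walkthrough above as runnable code. It only checks the visible behaviour (values), not the internal layout, which cannot be inspected directly from plain PHP.

```php
<?php
// Same arrays as in the walkthrough above.
$x = array('foo' => array(1, 2, 3), 'bar' => array(4, 5, 6));

// Cheap: no element is copied at this point.
$y = $x;

// The write triggers the copy, but only $y is affected.
$y['foo'][0] = 10;

var_dump($x['foo'][0]);            // int(1): the original is untouched
var_dump($y['foo'][0]);            // int(10)
var_dump($x['bar'] === $y['bar']); // bool(true): 'bar' was never modified
```

So the sharing is purely a memory optimization; it never changes what the code computes.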
Relating this to your original code, try printing out the memory usage not only at the start and the end, but after every new array assignment. You'll see that the memory usage increases by only a fraction of what the original array takes up after each step. This is because only part of the array is being copied, not the whole thing. Specifically, the first-level array and the specific sub array you change get copied, but the other sub arrays do not get copied.
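A rough sketch of that per-step measurement is below. The makeRows() helper is hypothetical: it only stands in for $query->result_array() so the example runs without CodeIgniter, and it assumes PHP 7 or later because of the type declarations. The exact byte counts depend on the PHP version and build, so treat the output as relative sizes only.

```php
<?php
// Hypothetical stand-in for $query->result_array(): builds $count rows at
// run time so that every row is a separately allocated array.
function makeRows(int $count): array
{
    $rows = array();
    for ($i = 0; $i < $count; $i++) {
        $rows[] = array(
            'id'      => $i,
            'name'    => 'row ' . $i,
            'payload' => str_repeat('x', 100) . $i,
        );
    }
    return $rows;
}

$steps = array();

// Cost of building the original array.
$mark = memory_get_usage();
$arr = makeRows(1000);
$steps['declaration'] = memory_get_usage() - $mark;

// Cost of one assignment followed by one write into a single row.
$mark = memory_get_usage();
$foo = $arr;
$foo[2]['id'] = 'changing this value';
$steps['foo'] = memory_get_usage() - $mark;

// Same again, starting from the already modified copy.
$mark = memory_get_usage();
$bar = $foo;
$bar[4]['id'] = 'changing this value again';
$steps['bar'] = memory_get_usage() - $mark;

foreach ($steps as $label => $bytes) {
    printf("%-12s %d bytes\n", $label, $bytes);
}
```

With rows like these, each later step should come out far smaller than the declaration, since only the top-level array and the one modified row need to be duplicated.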
The fact that the final amount of memory used is twice as much as the starting amount seems to be a coincidence due to the particular setup of your code and the number of copies of the array you make.
(In reality, PHP can do even better than what I describe here (it will probably keep only one copy of 'foo' and 'bar', etc.), but for the most part it boils down to the same sort of trick.)
If you want a more dramatic demonstration of this, do something like this:
$base = memory_get_usage();
$x = array('small' => array('this is small'), 'big' => array());
for ($i = 0; $i < 1000000; $i++) {
    $x['big'][] = $i;
}
echo (memory_get_usage() - $base).PHP_EOL; //a lot of memory
$y = $x;
$y['small'][0] = 'now a bit bigger';
echo (memory_get_usage() - $base).PHP_EOL; //a bit more memory
$z = $x;
$z['big'][0] = 2;
echo (memory_get_usage() - $base).PHP_EOL; //a LOT more memory