
How does PHP handle an array with a reference to itself as an element?

Let's say I declare an array:

$data = array( 'foo' => 'bar' );

Now I'll add a reference to itself as a new element:

$data['baz'] = &$data;

Dumping the contents of $data will result in:

Array
(
[foo] => bar
[baz] => Array
    (
        [foo] => bar
        [baz] => Array
         *RECURSION*
    )

)

Now, I can dump the contents of $data['baz']['baz']['baz']['baz']['baz']['baz']['baz']['baz']['baz'] and the result will be exactly the same as the above because the array has a pointer to itself as an element.

What I'd like to know is whether PHP handles the array as one single set of data, with the element holding exactly the same pointer I'd get when using $data, or whether it does something completely different.

Also, can PHP run out of memory while trying to return the contents of $data{['baz']*n}?

halfpastfour.am asked Oct 22 '13 12:10


1 Answer

Internally in PHP everything is stored inside a variant container called a ZVAL. $data is represented by a ZVAL, each key and each value inside $data is a ZVAL, etc.

So after the initial assignment, PHP has created three ZVALs:

   /-------------------\      /-------------------\   
   | ZVAL #1           |  /==>| ZVAL #2           |   
   |   type: array     |  |   |   type: string    |   
   |   data: [         |  |   |   data: "foo"     |
   |      {            |  |   \-------------------/
   |          key: =======/                         /-------------------\
   |          val: ================================>| ZVAL #3           |
   |      }            |                            |   type: string    |
   |   ]               |                            |   data:  "bar"    |
   \-------------------/                            \-------------------/

Note: the internal representation of array items does not correspond to what is shown above; I did not want to burden the answer with unnecessary detail. The representation of a ZVAL is also simplified for the same reason. If you want to learn more about PHP internals, please read the source and/or this.

Notice that you cannot tell from the ZVALs of "foo" and "bar" alone that they are being used as an array key/value pair: you have to know that the array refers to them.

After the assignment $data['baz'] = &$data, what happens is you now have a circular reference: somewhere inside ZVAL #1 there is a pointer back to ZVAL #1:

   /-------------------\      /-------------------\   
   | ZVAL #1           |  /==>| ZVAL #2           |   
/=>|   type: array     |  |   |   type: string    |   
|  |   data: [         |  |   |   data: "foo"     |
|  |      {            |  |   \-------------------/
|  |          key: =======/                         /-------------------\
|  |          val: ================================>| ZVAL #3           |
|  |      },           |                            |   type: string    |
|  |      {            |                            |   data: "bar"     |
|  |          key: =========================\       \-------------------/
|  |          val: =========\               |       
|  |      }            |    |               |       /-------------------\
|  |   ]               |    |               \======>| ZVAL #4           |
|  \-------------------/    |                       |   type: string    |
|                           |                       |   data: "baz"     |
\===========================/                       \-------------------/
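
The cycle in the diagram can be observed from userland. The following is a minimal sketch that reuses the $data array from the question; it only shows the visible behaviour and says nothing about how the engine lays out its structures in any particular PHP version.

```php
<?php
// Same setup as in the question.
$data = array('foo' => 'bar');
$data['baz'] = &$data;

// Write through a nested path that passes through the self-reference...
$data['baz']['baz']['baz']['foo'] = 'changed';

// ...and read the value back at the top level and at another depth.
// Both lines print "changed", because every path leads to the same array.
echo $data['foo'], "\n";
echo $data['baz']['baz']['foo'], "\n";
```

If the nested element were a copy rather than a pointer back to the same container, the top-level 'foo' would still hold 'bar'.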

So how does PHP resolve $data['baz']['baz']? It knows that $data is represented by ZVAL#1, and it sees that you are trying to index into it with array syntax. It looks at the ZVAL, sees that it's an array, finds the item having the key "baz" and gets the ZVAL that represents it. What do you know? That's ZVAL#1 once more. This concludes the resolution of $data['baz'].

At the next step, it sees that you are trying to index into $data['baz'] as an array. It knows that $data['baz'] is represented by ZVAL#1 so the same thing ends up happening again, and so on.
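
The two resolution steps above can be mimicked with a toy model. ToyZval and resolveIndex are hypothetical names invented for this sketch; they follow the simplified diagram, not the engine's real data structures.

```php
<?php
// Toy stand-in for a ZVAL: a type tag plus a payload (hypothetical, illustration only).
final class ToyZval
{
    public $type;
    public $data;

    public function __construct(string $type, $data)
    {
        $this->type = $type;
        $this->data = $data;
    }
}

// One resolution step: check that the container is an array, find the item
// with the given key, and return the ZVAL that represents its value.
function resolveIndex(ToyZval $container, string $key): ToyZval
{
    if ($container->type !== 'array') {
        throw new InvalidArgumentException('Cannot index into a non-array');
    }
    foreach ($container->data as $pair) {
        if ($pair['key']->data === $key) {
            return $pair['val'];
        }
    }
    throw new OutOfBoundsException("No such key: $key");
}

// ZVAL #1 with the two items from the second diagram.
$zval1 = new ToyZval('array', array());
$zval1->data[] = array('key' => new ToyZval('string', 'foo'), 'val' => new ToyZval('string', 'bar'));
$zval1->data[] = array('key' => new ToyZval('string', 'baz'), 'val' => $zval1); // points back to itself

$step1 = resolveIndex($zval1, 'baz');  // models $data['baz']
$step2 = resolveIndex($step1, 'baz');  // models $data['baz']['baz']

var_dump($step1 === $zval1); // bool(true)
var_dump($step2 === $zval1); // bool(true)
```

Each call only needs the container it is handed and the key, which is why the second step behaves exactly like the first.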

You will have noted that the process above does not involve storing any intermediate results (each step is independent of the one before it), which means that the PHP virtual machine will not hit a resource limit while resolving the array access, no matter how many levels deep it goes.
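
One rough way to check this is to walk through the self-reference many times and compare memory usage before and after. This is a sketch, not a benchmark: the depth of 100000 and the variable names are arbitrary choices, and the exact figure reported by memory_get_usage() may vary between PHP versions and configurations.

```php
<?php
$data = array('foo' => 'bar');
$data['baz'] = &$data;

// Start at the top-level array.
$cursor = &$data;

$before = memory_get_usage();

// Equivalent to resolving $data['baz']['baz']...['baz'] with 100000 indexes:
// each iteration rebinds $cursor to the 'baz' element of the current array.
for ($i = 0; $i < 100000; $i++) {
    $cursor = &$cursor['baz'];
}

$after = memory_get_usage();

$value  = $cursor['foo'];     // still the same array, so still 'bar'
$growth = $after - $before;   // expected to be zero or close to it

echo "value at depth 100000: ", $value, "\n";
echo "memory growth in bytes: ", $growth, "\n";
```

The loop keeps only one reference at a time, mirroring the step-by-step resolution described above, so depth alone should not drive memory use.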

Jon answered Oct 05 '22 07:10