
Assign by reference bug

Tags:

php

opcodes

I came across this seemingly very simple question the other day: "How to changing value in $array2 without referring $array1?" However, the more I looked into it, the odder it seemed that this is indeed functioning as intended. After this I started looking into the opcodes that are generated for the following code.

$array1 = array(2, 10);
$x = &$array1[1];
$array2 = $array1;
$array2[1] = 22;

echo $array1[1]; // Outputs 22

This seems crazy to me, since $array2 should only be a copy of $array1, and anything that happens to one array should not affect the contents of the other. Of course, if you comment out the second line, the final line will echo 10 as expected.
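To make the two cases easy to re-run side by side, here is a minimal sketch that wraps each variant in a function. The function names are hypothetical and only exist for this illustration; the values are the ones described above.

```php
<?php
// Hypothetical helper names, used only for this illustration.

// Variant from the question: one element of $array1 is bound by reference.
function copyWithElementReference() {
    $array1 = array(2, 10);
    $x = &$array1[1];      // slot 1 of $array1 is now a reference shared with $x
    $array2 = $array1;     // the copy keeps slot 1 as that same reference
    $array2[1] = 22;       // writes through the shared reference
    return array($array1[1], $x);
}

// Same code with the reference line left out.
function copyWithoutReference() {
    $array1 = array(2, 10);
    $array2 = $array1;
    $array2[1] = 22;       // only the copy changes
    return $array1[1];
}

list($fromArray1, $fromX) = copyWithElementReference();
echo $fromArray1, " ", $fromX, "\n"; // 22 22
echo copyWithoutReference(), "\n";   // 10
```

Note that $x changes as well, which suggests the write really does go through the reference rather than through $array1 as such.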

Looking further, I found a cool site that shows the opcodes PHP produces, using the Vulcan Logic Dumper. Here are the opcodes generated by the above code.

Finding entry points
Branch analysis from position: 0
Return found
filename:       /in/qO86H
function name:  (null)
number of ops:  11
compiled vars:  !0 = $array1, !1 = $x, !2 = $array2
line     # *  op                           fetch          ext  return  operands
---------------------------------------------------------------------------------
   3     0  >   INIT_ARRAY                                       ~0      2
         1      ADD_ARRAY_ELEMENT                                ~0      10
         2      ASSIGN                                                   !0, ~0
   4     3      FETCH_DIM_W                                      $2      !0, 1
         4      ASSIGN_REF                                               !1, $2
   5     5      ASSIGN                                                   !2, !0
   6     6      ASSIGN_DIM                                               !2, 1
         7      OP_DATA                                                  22, $6
   8     8      FETCH_DIM_R                                      $7      !0, 1
         9      ECHO                                                     $7
        10    > RETURN                                                   1

These opcodes aren't documented well at http://php.net/manual/en/internals2.opcodes.php, but I believe that, in English, they are doing the following, line by line (this might be more for me than anyone else).

  1. Line 3: We initialize the array with its first value and then add 10 to it before assigning it to $array1.
  2. Line 4: Get a write-only(?) value from index 1 of $array1 and assign it by reference to $x.
  3. Line 5: Assign $array1 to $array2.
  4. Line 6: Assign to index 1 of $array2. I am guessing OP_DATA sets it to 22, although $6 is never returned. OP_DATA has absolutely no documentation and is not listed as an opcode anywhere I have looked.
  5. Line 8: Fetch a read-only value from index 1 of $array1 and echo it out.

Even working through the opcodes I am not sure where this is going wrong. I have a feeling the lack of documentation on the opcodes and my inexperience with working with them is likely keeping me from figuring out where this is going wrong.
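One possible reason the listing does not reveal anything: as far as I understand it, the by-reference status is carried by the array element itself rather than by the instructions, so the ASSIGN on line 5 looks the same either way. A rough way to observe that status from userland is var_dump, which marks reference elements with an ampersand. This is only a sketch; the helper name dumpToString is made up for the example, and extensions that override var_dump (such as Xdebug) may format the output differently.

```php
<?php
// Hypothetical helper: capture var_dump output as a string.
function dumpToString($value) {
    ob_start();
    var_dump($value);
    return ob_get_clean();
}

$array1 = array(2, 10);
$before = dumpToString($array1);   // no element is a reference yet

$x = &$array1[1];
$after = dumpToString($array1);    // slot 1 should now be marked with "&"

echo $before;
echo $after;
```

Comparing the two dumps shows the marker appearing on index 1 only after the reference is taken, before any copy of the array is made.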

EDIT 1:

As pointed out by Mike in the first comment, the reference status of array elements is preserved when an array is copied. This is documented in a note on the manual's array page: http://php.net/manual/en/language.types.array.php#104064. Funnily enough, this is not flagged as a warning. What surprises me is that, if this is true, the reference status is not preserved for the following code, as you would expect.

$array1 = array(2, 10);
$x = &$array1;
$array2 = $array1;
$array2[1] = 22;

echo $array1[1]; // Output is 10

So it seems this only happens when you assign single elements by reference, which makes this functionality even more confusing.

Why does PHP only preserve the reference status of an array's elements when they are individually assigned by reference?
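The difference between the two snippets can be put side by side in a short sketch. The variable names $array3, $array4, $y, $caseA and $caseB are additions for this illustration and do not appear in the original code.

```php
<?php
// Case A: reference to the whole array variable.
$array1 = array(2, 10);
$x = &$array1;          // $x and $array1 are two names for one array
$array2 = $array1;      // plain copy of the array's value
$array2[1] = 22;        // only the copy changes
$x[1] = 99;             // goes to $array1, because $x is an alias for it

$caseA = array($array1[1], $array2[1]);       // 99 and 22

// Case B: reference to a single element.
$array3 = array(2, 10);
$y = &$array3[1];       // only slot 1 is a reference
$array4 = $array3;      // the copy carries the reference in slot 1
$array4[0] = 7;         // slot 0 is an ordinary value: only the copy changes
$array4[1] = 22;        // slot 1 is shared: $array3 and $y change too

$caseB = array($array3[0], $array3[1], $y);   // 2, 22 and 22
```

In case A the reference belongs to the variable, so the copy contains only plain values. In case B the reference sits inside the array, and copying the array copies that slot as a reference.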

EDIT 2:

I did some testing using HHVM today, and HHVM handles the first snippet of code the way you would expect. I love PHP, but HHVM is looking better and better compared to the Zend Engine.

eatingthenight asked Sep 28 '14 02:09


1 Answer

This is explained over at the PHP manual (even if you have to spend more time than you should have to in order to find it), specifically over at http://php.net/manual/en/language.types.array.php#104064

The "shared" data stays shared, with the initial assignment just acting as an alias. It's not until you start manipulating the arrays with independent operations like ...[] = ... that the interpreter starts to treat them as divergent lists, and even then the shared data stays shared, so you can have two arrays with a shared first n elements but divergent subsequent data.
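A small sketch of that "divergent lists, shared element" situation, reusing the arrays from the question. The names $sizes and $shared are just holders for the results in this example.

```php
<?php
$array1 = array(2, 10);
$x = &$array1[1];
$array2 = $array1;

$array2[] = 5;      // appending makes the two arrays diverge...
$array2[1] = 22;    // ...but slot 1 is still the shared reference

$sizes  = array(count($array1), count($array2));  // 2 and 3
$shared = array($array1[1], $array2[1]);          // 22 and 22

echo implode(",", $sizes), " ", implode(",", $shared), "\n"; // 2,3 22,22
```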

For a true "copy by value" for one array to another, you pretty much end up doing something like

$arr2 = array();
foreach($arr1 as $val) {
  $arr2[] = $val;
}

or

$arr2 = array();
// Assumes $arr1 has consecutive integer keys starting at 0.
for($i=count($arr1)-1; $i>-1; $i--) {
  $arr2[$i] = $arr1[$i];
}

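(using reverse looping mostly because not enough people remember that's a thing you can do, and it can be more efficient than a forward loop because count() is only evaluated once =) Note that the elements are inserted from the last index down, so a later foreach over $arr2 visits them in reverse order.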

You'd imagine there'd be an array_copy function or something to help deal with the array copy quirk, but there just doesn't seem to be one. It's odd, but one of those "the state of PHP" things. A choice was made in the past, PHP's lived with that choice for quite a few years as a result, so it's just "one of those things". Unfortunately!
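For illustration, such a helper could look roughly like the sketch below. The name array_copy_values is hypothetical (PHP has no built-in with that name), and the copy is shallow: a nested array that itself contains reference elements would still carry them.

```php
<?php
// Hypothetical helper; not a PHP built-in.
function array_copy_values(array $source) {
    $copy = array();
    foreach ($source as $key => $value) {
        // $value is a plain value here, even if the slot in $source
        // is a reference, so the new slot is not tied to the original.
        $copy[$key] = $value;
    }
    return $copy;
}

$array1 = array(2, 10);
$x = &$array1[1];
$array2 = array_copy_values($array1);
$array2[1] = 22;

echo $array1[1], "\n"; // 10
```

Unlike the two loops above, this variant keeps the original keys and their order, so it also works for associative arrays.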

Mike 'Pomax' Kamermans answered Oct 03 '22 00:10