
My PHP array of references is "magically" becoming an array of values... why?

I'm creating a wrapper function around mysqli so that my application needn't be overly complicated with database-handling code. Part of that is a bit of code to parameterize the SQL calls using mysqli::bind_param(). bind_param(), as you may know, requires references. Since it's a semi-general wrapper, I end up making this call:

call_user_func_array(array($stmt, 'bind_param'), $this->bindArgs);

and I get an error message:

Parameter 2 to mysqli_stmt::bind_param() expected to be a reference, value given

The discussion above is meant to forestall those who would say "You don't need references at all in your example".

My "real" code is a bit more complicated than anyone wants to read, so I've boiled the code leading up to this error into the following (hopefully) illustrative example:

class myclass {
  private $myarray = array();

  function setArray($vals) {
    foreach ($vals as $key => &$value) {
      $this->myarray[] =& $value;
    }
    $this->dumpArray();
  }
  function dumpArray() {
    var_dump($this->myarray);
  }
}

function myfunc($vals) {
  $obj = new myclass;
  $obj->setArray($vals);
  $obj->dumpArray();
}

myfunc(array('key1' => 'val1',
             'key2' => 'val2'));

The problem appears to be that, in myfunc(), in between the call to setArray() and the call to dumpArray(), all the elements in $obj->myarray stop being references and become just values instead. This can be easily seen by looking at the output:

array(2) {
  [0]=>
  &string(4) "val1"
  [1]=>
  &string(4) "val2"
}
array(2) {
  [0]=>
  string(4) "val1"
  [1]=>
  string(4) "val2"
}

Note that the array is in the "correct" state in the first half of the output here. If it made sense to do so, I could make my bind_param() call at that point, and it would work. Unfortunately, something breaks in the latter half of the output. Note the lack of the "&" on the array value types.

What happened to my references? How can I prevent this from happening? I hate to call "PHP bug" when I'm really not a language expert, but could this be one? It does seem very odd to me. I'm using PHP 5.3.8 for my testing at the moment.


Edit:

As more than one person pointed out, the fix is to change setArray() to accept its argument by reference:

function setArray(&$vals) {

I'm adding this note to document WHY this seems to work.

PHP generally, and mysqli in particular, appear to have a slightly odd concept of what a "reference" is. Observe this example:

$a = "foo";
$b = array(&$a);
$c = array(&$a);
var_dump($b);
var_dump($c);

First of all, I'm sure you're wondering why I'm using arrays instead of scalar variables -- it's because var_dump() doesn't show any indication of whether a scalar is a reference, but it does for array members.

Anyway, at this point, $b[0] and $c[0] are both references to $a. So far, so good. Now we throw our first wrench into the works:

unset($a);
var_dump($b);
var_dump($c);

$b[0] and $c[0] are both still references to the same thing. If we change one, both will still change. But what are they references to? Some unnamed location in memory. Of course, garbage collection ensures that our data is safe, and will remain so, until we stop referring to it.

For our next trick, we do this:

unset($b);
var_dump($c);

Now $c[0] is the only remaining reference to our data. And, whoa! Magically, it's no longer a "reference". Not by var_dump()'s measure, and not by mysqli::bind_param()'s measure either.

The PHP manual says that there's a separate flag, 'is_ref', on every piece of data. However, this test appears to suggest that 'is_ref' is actually equivalent to '(refcount > 1)'.
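The sequence above can also be checked programmatically rather than by reading the dumps by eye. The following is a minimal sketch: hasRefMark() is a hypothetical helper (not part of PHP) that captures var_dump() output and looks for the "&" marker, so it only measures what var_dump() reports.

```php
<?php
// Hypothetical helper: true if var_dump() prints the "&" reference marker
// for any element of $arr. Only meaningful when the values contain no "&".
function hasRefMark(array $arr) {
    ob_start();
    var_dump($arr);
    return strpos(ob_get_clean(), '&') !== false;
}

$a = "foo";
$b = array(&$a);
$c = array(&$a);

$before = hasRefMark($c);   // $a, $b[0] and $c[0] all share the value

unset($a);
unset($b);

$after = hasRefMark($c);    // $c[0] is now the only holder of the value
```

If the analysis here is right, $before should be true and $after should be false.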

For fun, you can modify this toy example as follows:

$a = array("foo");
$b = array(&$a[0]);
$c = array(&$a[0]);

var_dump($a);
var_dump($b);
var_dump($c);

Note that all three arrays have the reference mark on their members, which backs up the idea that 'is_ref' is functionally equivalent to '(refcount > 1)'.

It's beyond me why mysqli::bind_param() would care about this distinction in the first place (or perhaps it's call_user_func_array()... either way), but it would appear that what we "really" need to ensure is that the reference count is at least 2 for each member of $this->bindArgs in our call_user_func_array() call (see the very beginning of the post/question). And the easiest way to do that (in this case) is to make setArray() pass-by-reference.
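As a rough sketch of that idea, the reference array can be built from an array that stays alive in the caller's scope, so every element still has a second holder when call_user_func_array() runs. Both bindLike() and makeRefs() are hypothetical names invented for this illustration; bindLike() merely stands in for mysqli_stmt::bind_param() so the example runs without a database.

```php
<?php
// Hypothetical stand-in for bind_param(): takes its arguments by reference.
function bindLike(&$first, &$second) {
    $first = strtoupper($first);
    $second = strtoupper($second);
    return true;
}

// Hypothetical helper: returns an array whose elements are references to
// the elements of the caller's array (note the by-reference parameter).
function makeRefs(array &$values) {
    $refs = array();
    foreach (array_keys($values) as $key) {
        $refs[] = &$values[$key];
    }
    return $refs;
}

$params = array('key1' => 'val1', 'key2' => 'val2');
$refs = makeRefs($params);   // $params stays in scope, so the references survive
$ok = call_user_func_array('bindLike', $refs);
```

Because $params outlives the call, each element of $refs is still shared with $params, and the changes made inside bindLike() should be visible in $params afterwards.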


Edit:

For extra fun and games, I modified my original program (not shown here) to leave its equivalent to setArray() pass-by-value, and to create a gratuitous extra array, bindArgsCopy, containing exactly the same thing as bindArgs. Which means that, yes, both arrays contained references to "temporary" data which was deallocated by the time of the second call. As predicted by the analysis above, this worked. This demonstrates that the above analysis is not an artifact of var_dump()'s inner workings (a relief to me, at least), and it also demonstrates that it's the reference count that matters, not the "temporary-ness" of the original data storage.

So, I make the following assertion: In PHP, for the purpose of call_user_func_array() (and probably more), saying that a data item is a "reference" is the same thing as saying that the item's reference count is greater than or equal to 2 (ignoring PHP's internal memory optimizations for equal-valued scalars).


Administrivia note: I'd love to give mario the site credit for the answer, as he was the first to suggest the correct answer, but since he wrote it in a comment, not an actual answer, I couldn't do it :-(

Rick Koshi asked Dec 01 '11 03:12


2 Answers

Adding the & in the argument signature fixed it for me. This means the function works on the caller's original array rather than on a copy of it.

function setArray(&$vals) {
// ...
}

CodePad.

alex answered Oct 02 '22 08:10


Pass the array as a reference:

  function setArray(&$vals) {
    foreach ($vals as $key => &$value) {
      $this->myarray[] =& $value;
    }
    $this->dumpArray();
  }

My guess (which could be wrong in some details, but is hopefully correct for the most part) as to why this makes your code work as expected: when you pass by value, everything's cool for the call to dumpArray() inside setArray(), because the local $vals array inside setArray() still exists and its elements are still shared with $this->myarray. But when control returns to myfunc(), that temporary variable is gone, as are all references to it. So PHP dutifully changes the array elements to plain string values instead of references before deallocating the memory for it. If you pass the array by reference from myfunc(), however, setArray() is using references to an array that lives on when control returns to myfunc().

Trott answered Oct 02 '22 10:10