Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Issue with cloning and pass-by-reference

So for the past few days I have been tearing my hair out trying to get a class to clone properly. The problem is that cloning doesn't remove/redo any of the pass-by-reference. The result is, that the main data object is still passed as a reference, thus completely negating the effect of the clone.

Here's a simplified version of the problem:

class my_class {

    private $data;

    public $var1;
    public $var2;
    public $var3;


    public function __construct() {
        $this->data = new stdClass;

        $this->data->var1 = 'a';
        $this->data->var2 = 'b';
        $this->data->var3 = 'c';

        $this->var1 = &$this->data->var1;
        $this->var2 = &$this->data->var2;
        $this->var3 = &$this->data->var3;
    }
}


$original  = new my_class;
$new       = clone $original;
$new->var3 = 'd';

// Output Should Be "c", outputs "d"
echo $original->var3;

See it in action here: http://3v4l.org/nm6NW

My Question: How can I use __clone() to correct the output from "d" to "c"?

Please help in any way you can!?


Update

I ended up making __clone() trigger an error and creating a function called make_clone() to standardize cloning.

__clone() now looks like:

public function __clone() {
    $trace = debug_backtrace();
    $fmt = 'Invalid clone in <b>%4$s</b> on line <b>%5$s</b>. Because of how cloning works, and how references are configured within the class, extensions of %1$s cannot be cloned. Please use <code>make_clone()</code> instead. Error triggered';
    trigger_error(sprintf($fmt, $trace[0]['class'], $trace[0]['function'], 'clone', $trace[0]['file'], $trace[0]['line']), E_USER_NOTICE);
}

and make_clone() looks like:

public function make_clone() {
    // This line recreates the current instance at its' current state.
    $clone = new $this(generate::string($this->object));
    // In my class $this->input is a type of history state.
    // The history of both the original and the clone should match
    $clone->input = $this->input;
    return $clone;
}
like image 813
Nicholas Summers Avatar asked Aug 21 '14 07:08

Nicholas Summers


1 Answers

TL;DR

This is a classic case of PHP SNAFU. I will explain how and why it happens, but unfortunately as far as I can tell there is no possible solution that is satisfactory.

A brittle solution exists if you can run code before PHP shallow clones the object (e.g. by writing your own cloning method), but not if the code runs afterwards, which is how __clone works. However, this solution can still fail for other reasons outside your control.

There is also another option that is safe which involves a well-known "cloning" trick, but it also has drawbacks: it only works on data that is serializable, and it doesn't allow you to keep any references inside that data around even if you want to.

At the end of the day if you want to keep your sanity you will have to move away from implementing the properties $this->varN as references.

The plight of the poor PHP developer

Normally you would have to deep clone everything that needs to be cloned inside __clone. Then you would also have to reassign any references that still point to the instances that were just deep cloned.

You would think these two steps should be enough, for example:

public function __construct()
{
    $this->data = new stdClass;
    $this->data->var1 = 'a';
    $this->data->var2 = 'b';
    $this->data->var3 = 'c';
    $this->assignReferences();
}

public function __clone()
{
    $this->data = clone $this->data;
    $this->assignReferences();
}

private function assignReferences()
{
    $this->var1 = &$this->data->var1;
    $this->var2 = &$this->data->var2;
    $this->var3 = &$this->data->var3;        
}

However, this does not work. How can that be?

Zend engine references

If you var_dump($this->data) before and after assignReferences() in the constructor you will see that assigning those references causes the contents of $this->data to become references themselves.

This is an artifact of how references are internally implemented in PHP and there is nothing you can do about it directly. What you can do is convert them back to normal values first by losing all other references to them, after which cloning as above would work.

In code:

public function __construct()
{
    $this->data = new stdClass;
    $this->data->var1 = 'a';
    $this->data->var2 = 'b';
    $this->data->var3 = 'c';
    $this->assignReferences();
}

public function makeClone()
{
    unset($this->var1);  // turns $this->data->var1 into non-reference
    unset($this->var2);  // turns $this->data->var2 into non-reference
    unset($this->var3);  // turns $this->data->var3 into non-reference

    $clone = clone $this;               // this code is the same
    $clone->data = clone $clone->data;  // as what would go into
    $clone->assignReferences();         // __clone() normally

    $this->assignReferences(); // undo the unset()s
    return $clone;
}

private function assignReferences()
{
    $this->var1 = &$this->data->var1;
    $this->var2 = &$this->data->var2;
    $this->var3 = &$this->data->var3;        
}

This appears to work, but it's immediately not very satisfactory because you have to know that the way to clone this object is $obj->makeClone() instead of clone $obj -- the natural approach will fail.

However, there is also a more insidious bug here waiting to bite you: to un-reference the values inside $this->data you have to lose all references to them in the program. The code above does so for the references in $this->varN, but what about references other code might have created?

Compare this:

$original = new my_class;
$new = $original->makeClone();
$new->var3 = 'd'; 

echo $original->var3; // works, "c"

To this:

$original = new my_class;
$oops = &$original->var3; // did you think this might be a problem?
$new = $original->makeClone();
$new->var3 = 'd'; 

echo $original->var3; // doesn't work!

We are now back to square one. And worse, there is no way to prevent someone from doing this and messing up your program.

Kill the references with fire

There is a guaranteed way to make the references inside $this->data go away no matter what: serialization.

public function __construct()
{
    $this->data = new stdClass;
    $this->data->var1 = 'a';
    $this->data->var2 = 'b';
    $this->data->var3 = 'c';
    $this->assignReferences();
}

public function __clone()
{
    $this->data = unserialize(serialize($this->data)); // instead of clone
    $this->assignReferences();
}

This works with the values in question, but it also has drawbacks:

  1. You cannot have any values (recursively) inside $this->data that are not serializable.
  2. It will indiscriminately kill all references inside $this->data -- even those you might want to preserve on purpose.
  3. It's less performant (a theoretical point, to be fair).

So what to do?

After the obligatory bashing of PHP, follow the classic doctor's advice: if it hurts when you do something, then don't do it.

In this case this means that you just can't expose the contents of $this->data through public properties (references) on the object. Instead of this use getter functions or possibly implement the magic __get.

like image 163
Jon Avatar answered Oct 20 '22 13:10

Jon