
PHP registerNodeClass and reusing variable names

Tags: dom, php

When I register a new base node type with registerNodeClass, it looks like custom properties revert to their default value if I reuse variable names for the created elements. I'm actually trying to do this in a loop, but here's an example that I think shows clearly what I mean:

<?php

class myDOMElement extends DOMElement
{
    public $myProp = 'Some default';
}

$doc = new DOMDocument();
$doc->registerNodeClass('DOMElement', 'myDOMElement');

$node = $doc->createElement('a');
$node->myProp = 'A';
$doc->appendChild($node);

# This seems to alter node A in $doc, not what I expected:
$node = $doc->createElement('b');
$node->myProp = 'B';
$doc->appendChild($node);

# Note: $nodeC instead of $node, this works fine. 
$nodeC = $doc->createElement('c');
$nodeC->myProp = 'C';
$doc->appendChild($nodeC);

foreach ($doc->childNodes as $n) {
    echo 'Tag ', $n->tagName, ' myProp:', PHP_EOL;
    var_dump($n->myProp);
}

Why do I get "Some default" for tag a instead of the value "A"?

Tag a myProp:
string(12) "Some default"
Tag b myProp:
string(1) "B"
Tag c myProp:
string(1) "C"
Jackson Pauls asked Feb 26 '16 14:02


1 Answer

Let's assume we are working with PHP 7 (the described behavior applies to PHP versions 5 through 7, at least).

The DOMNode::appendChild method sets up the internal structures of the new node, updates the internal structures of the parent node (in our case, the DOMDocument object), and then returns a DOMNode object based on those structures. The returned object and the appended child node object are in fact the same:

$ret_node = $doc->appendChild($node);
debug_zval_dump($node);
debug_zval_dump($ret_node);
var_dump(spl_object_hash($node));
var_dump(spl_object_hash($ret_node));

Output:

object(myDOMElement)#2 (18) refcount(3){
..
object(myDOMElement)#2 (18) refcount(3){
...
string(32) "00000000121277ac00000000658254f1"
string(32) "00000000121277ac00000000658254f1"

The read handler of the DOMNode::$childNodes property creates a DOMNodeList iterator object. The current iterator value is fetched from a zval prepared by php_dom_iterator_move_forward. The latter merely "creates a new object" (specifically, a DOMNode) based on the internal XML structures.

But the way php_dom_create_object creates objects is tricky! If the object is constructed for the first time, it saves the pointer by means of php_libxml_increment_node_ptr:

php_libxml_increment_node_ptr((php_libxml_node_object *)intern, obj, (void *)intern);

The next time php_dom_create_object is called for the same node, it detects the saved pointer, increments the reference count, and returns the previously created object:

if ((intern = (dom_object *) php_dom_object_get_data((void *) obj))) {
  GC_REFCOUNT(&intern->std)++;
  ZVAL_OBJ(return_value, &intern->std);
  return 1;
}

In the free-object handler (which is called when the object is being destroyed), the DOM extension calls php_libxml_decrement_node_ptr.

As we can see, a DOM object lives only as long as some PHP variable refers to it. Once the last such variable goes out of scope or is reassigned, the object is destroyed. The next time the node is requested, the DOM extension generates a new object for us.
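To make this lifetime rule concrete, here is a minimal sketch that reuses the myDOMElement class from the question. The outputs in the comments are what I would expect on PHP 7, given the behavior described above.

```php
<?php

class myDOMElement extends DOMElement
{
    public $myProp = 'Some default';
}

$doc = new DOMDocument();
$doc->registerNodeClass('DOMElement', 'myDOMElement');

$node = $doc->createElement('a');
$node->myProp = 'A';
$doc->appendChild($node);

// While $node still refers to the PHP object, the extension hands back
// that very same object when we fetch the node from the document.
$sameWhileHeld = ($doc->firstChild === $node);
$propWhileHeld = $doc->firstChild->myProp;

// Drop the only PHP reference. The PHP object is destroyed, although the
// underlying libxml node stays in the document.
unset($node);

// Fetching the node again builds a fresh myDOMElement with default properties.
$propAfterUnset = $doc->firstChild->myProp;

var_dump($sameWhileHeld);  // bool(true)
var_dump($propWhileHeld);  // string(1) "A"
var_dump($propAfterUnset); // string(12) "Some default"
```

Reassigning $node, as in the question, has the same effect as the unset() call here: the old object loses its last reference.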

Now let's add a destructor to the myDOMElement class:

class myDOMElement extends DOMElement
{
    public $myProp = 'Some default';

    public function __destruct() {
      echo __METHOD__, PHP_EOL;
    }
}

The following code then shows that the first DOMNode object is destroyed at the line where we assign the result of $doc->createElement('b') to $node:

$node = $doc->createElement('a');
$node->myProp = 'A';
$doc->appendChild($node);

echo "Marker B-1\n";
$node = $doc->createElement('b');
echo "Marker B-2\n";
$node->myProp = 'B';
$doc->appendChild($node);

Output:

Marker B-1
myDOMElement::__destruct
Marker B-2

Since the DOM extension itself doesn't store the zval objects, the previous object stored in the $node variable loses its last reference and is destroyed automatically. From then on there are no references to that PHP object, and its myProp property is destroyed with it. However, the DOM extension will generate a new instance for the a node if we request it in a loop:

foreach ($doc->childNodes as $n) {
  var_dump($n->tagName);
}

Thus, the answer to your question

Why do I get "Some default" for tag a instead of the value "A"?

is: the object with $myProp = "A" is actually destroyed, because it loses its last reference when you assign another object to the $node variable. The DOM extension doesn't store PHP objects for us; it delegates this responsibility to the user. The node itself, however, is still present in the internal DOM structure. Therefore, when the loop reaches the a tag, the DOM extension generates a new object with default properties.

Here is a workaround:

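$nodes = [];
foreach (['a', 'b'] as $name) {
  // Keep a reference to every element so that its PHP object stays alive.
  $nodes[] = $node = $doc->createElement($name);
  $node->myProp = $name;
  $doc->appendChild($node);
}
foreach ($doc->childNodes as $n) {
  echo 'Tag ', $n->tagName, ' myProp:'; var_dump($n->myProp);
}
// Release the references once the custom properties are no longer needed.
unset($nodes);

Output: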

Tag a myProp:string(1) "a"
Tag b myProp:string(1) "b"
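If the custom values are plain strings, another option (my own suggestion, not something the DOM extension requires) is to store them on the libxml node itself as an attribute, so they no longer depend on the lifetime of any PHP object. The attribute name data-my-prop and the root wrapper element below are made up for illustration:

```php
<?php

$doc = new DOMDocument();

// A hypothetical wrapper element, so the document has a single root.
$root = $doc->appendChild($doc->createElement('root'));

foreach (['a', 'b'] as $name) {
    // $node is reused on every iteration, but the value is stored in the
    // DOM tree itself rather than on the PHP object.
    $node = $doc->createElement($name);
    $node->setAttribute('data-my-prop', strtoupper($name));
    $root->appendChild($node);
}
unset($node);

$seen = [];
foreach ($root->childNodes as $n) {
    // Each $n may be a freshly generated PHP object; the attribute is still there.
    $seen[$n->tagName] = $n->getAttribute('data-my-prop');
}

var_dump($seen);
```

The trade-off is that the attribute becomes part of the document and will show up in the output of saveXML(), so this only fits when that is acceptable.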
Ruslan Osmanov answered Oct 04 '22 04:10