
How do I clone distinct XML structures without data in PHP?

Tags: php, xml

I have an XML document that looks like this:

<root>

  <node/>

  <node>
    <sub>more</sub>
  </node>

  <node>
    <sub>another</sub>
  </node>

  <node>value</node>

</root>

Here's my pseudo-code:

import xml.

create empty-xml.

foreach child of imported-xml-root-node,

    recursively clone node structure without data.

    if clone does not match one already in empty-xml,
        then add clone to empty-xml.

I'm trying to get a result that looks like this:

<root>

  <node/>

  <node>
    <sub/>
  </node>

</root>

Note that my piddly example data is only 3 nodes deep. In production, there will be an unknown number of descendants, so an acceptable answer needs to handle variable node depths.


Failed Approaches

I have reviewed the DOMNode class, which has a cloneNode method with a recursive option that I would like to use, although it would take some extra work to purge the data. But while the class has a hasChildNodes method that returns a boolean, I can't find a way to actually get the collection of children.

$doc = new DOMDocument();
$doc->loadXML($xml);

$root_node = $doc->documentElement;

if ( $root_node->hasChildNodes() ) {

  // looking for something like this:
  // foreach ($root_node->children() as $child)
  //   $doppel = $child->cloneNode(true);

}

Secondly, I tried the SimpleXMLElement class, which does have an awesome children method. Although it lacks a recursive option, I built a simple function to get around that. But the class is missing a clone/copyNode method, and my function is bloating into something nasty to compensate. Now I'm considering combining the two classes so that I have access to both SimpleXMLElement::children and DOMNode::cloneNode, but I can tell this is not going cleanly, and surely this problem can be solved better.

$sxe = new SimpleXMLElement($xml);

$indentation = 0;

function getNamesRecursive( $xml, &$indentation )
{
    $indentation++;
    foreach($xml->children() as $child) {
        for($i=0;$i<$indentation;$i++)
          echo "\t";
        echo $child->getName() . "\n";
        getNamesRecursive($child,$indentation);
    }
    $indentation--;
}

getNamesRecursive($sxe,$indentation);
Jeff Puckett asked Apr 25 '26 06:04


1 Answer

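Consider XSLT, the special-purpose language designed to transform XML documents. PHP provides an XSLT 1.0 processor through the XSLTProcessor class of its xsl extension. For the sample document, you simply need to keep the first two elements at each level and copy only the tags of the third-level elements, not their text.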

XSLT (save as an .xsl file, such as the Script.xsl loaded in the PHP below)

<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output version="1.0" encoding="UTF-8" indent="yes" omit-xml-declaration="yes" />
<xsl:strip-space elements="*"/>

  <!-- Identity Transform -->
  <xsl:template match="@*|node()">
    <xsl:copy>      
        <xsl:apply-templates select="@*|node()"/>      
    </xsl:copy>
  </xsl:template>

  <!-- Remove any element whose position among its siblings is greater than 2 -->
  <xsl:template match="*[position() &gt; 2]"/>

  <!-- Copy only the tags of third-level elements, dropping their text -->
  <xsl:template match="/*/*/*">
    <xsl:copy/>
  </xsl:template>

</xsl:transform>

PHP

// LOAD XML AND XSL FILES
$xml = new DOMDocument('1.0', 'UTF-8');
$xml->load('Input.xml');

$xslfile = new DOMDocument('1.0', 'UTF-8');
$xslfile->load('Script.xsl');

// TRANSFORM XML with XSLT
$proc = new XSLTProcessor();
$proc->importStylesheet($xslfile);
$newXml = $proc->transformToXML($xml);

// ECHO OUTPUT STRING
echo $newXml;
# <root>
#   <node/>
#   <node>
#     <sub/>
#   </node>
# </root>

// NEW DOM OBJECT
$final = new DOMDocument('1.0', 'UTF-8');
$final->loadXML($newXml);
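
The stylesheet above selects by sibling position and by a fixed depth (/*/*/*), so it fits the sample document. For an unknown number of descendants, a DOM-only approach may be easier to reason about. What follows is a minimal sketch, not a production implementation: it follows the question's pseudo-code, iterates children through the DOMNode childNodes property, and treats two clones as equal when their serialized XML is identical. The function names cloneStructure and distinctStructures are hypothetical, and the sketch assumes attributes and namespaces can be ignored.

```php
<?php

/**
 * Recursively copy an element's tag structure into $target,
 * skipping text, comments and attributes (assumption: only tags matter).
 */
function cloneStructure(DOMElement $source, DOMDocument $target): DOMElement
{
    $copy = $target->createElement($source->nodeName);

    foreach ($source->childNodes as $child) {
        // childNodes also contains text and comment nodes; keep elements only.
        if ($child instanceof DOMElement) {
            $copy->appendChild(cloneStructure($child, $target));
        }
    }

    return $copy;
}

/**
 * Build a new root holding one data-free clone per distinct child structure.
 */
function distinctStructures(string $xml): string
{
    $in = new DOMDocument();
    if (!$in->loadXML($xml)) {
        throw new InvalidArgumentException('Could not parse the XML input.');
    }

    $out  = new DOMDocument('1.0', 'UTF-8');
    $root = $out->appendChild($out->createElement($in->documentElement->nodeName));
    $seen = [];

    foreach ($in->documentElement->childNodes as $child) {
        if (!$child instanceof DOMElement) {
            continue;
        }

        $clone     = cloneStructure($child, $out);
        $signature = $out->saveXML($clone); // serialized structure used as the dedup key

        if (!isset($seen[$signature])) {
            $seen[$signature] = true;
            $root->appendChild($clone);
        }
    }

    // Serialize the root element only, so no XML declaration is emitted.
    return $out->saveXML($root);
}
```

Calling distinctStructures() with the question's sample document should give a root with two children, an empty node and a node containing an empty sub. Duplicates are only removed among the root's direct children, as in the pseudo-code; deduplicating at deeper levels would need the same check inside cloneStructure.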
Parfait answered Apr 27 '26 02:04