
PHP's SimpleXML doesn't keep order between different element types

As far as I can tell, when an XML document has several types of elements at the same level of the tree, PHP's SimpleXML (both SimpleXMLElement and SimpleXMLIterator) doesn't keep the order of the elements relative to each other, only the order within each element type.

For example, consider the following structure:

<catalog>
    <book>
        <title>Harry Potter and the Chamber of Secrets</title>
        <author>J.K. Rowling</author>
    </book>
    <book>
        <title>Great Expectations</title>
        <author>Charles Dickens</author>
    </book>
</catalog>

If I had this structure and used either SimpleXMLIterator or SimpleXMLElement to parse it, I would end up with an array that looked something like this:

Array (
    [book] => Array (
        [0] => Array (
            [title] => Array (
                [0] => Harry Potter and the Chamber of Secrets
            )
            [author] => Array (
                [0] => J.K. Rowling
            )
        )
        [1] => Array (
            [title] => Array (
                [0] => Great Expectations
            )
            [author] => Array (
                [0] => Charles Dickens
            )
        )
    )
)

This would be fine, since I only have book elements, and it keeps the order properly within those elements. However, say I add movie elements, too:

<catalog>
    <book>
        <title>Harry Potter and the Chamber of Secrets</title>
        <author>J.K. Rowling</author>
    </book>
    <movie>
        <title>The Dark Knight</title>
        <director>Christopher Nolan</director>
    </movie>
    <book>
        <title>Great Expectations</title>
        <author>Charles Dickens</author>
    </book>
    <movie>
        <title>Avatar</title>
        <director>James Cameron</director>
    </movie>
</catalog>

Parsing with SimpleXMLIterator or SimpleXMLElement would result in the following array:

Array (
    [book] => Array (
        [0] => Array (
            [title] => Array (
                [0] => Harry Potter and the Chamber of Secrets
            )
            [author] => Array (
                [0] => J.K. Rowling
            )
        )
        [1] => Array (
            [title] => Array (
                [0] => Great Expectations
            )
            [author] => Array (
                [0] => Charles Dickens
            )
        )
    )
    [movie] => Array (
        [0] => Array (
            [title] => Array (
                [0] => The Dark Knight
            )
            [director] => Array (
                [0] => Christopher Nolan
            )
        )
        [1] => Array (
            [title] => Array (
                [0] => Avatar
            )
            [director] => Array (
                [0] => James Cameron
            )
        )
    )
)

Because it represents the data this way, it seems that I have no way to tell that the order of the books and movies in the XML file was actually book, movie, book, movie. It just separates them into two categories (although it keeps the order within each category).

Does anyone know of a workaround, or a different XML parser that doesn't have this behavior?

asked May 04 '14 at 21:05 by Josh Sherick


1 Answer

"If I ... used either SimpleXMLIterator or SimpleXMLElement to parse it, I would end up with an array" - no, you wouldn't; you would end up with an object, which happens to behave like an array in certain ways.

The output of a recursive dump of that object is not the same as the result of iterating over it.
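
To make that difference concrete, here is a minimal sketch. It assumes a shortened version of the catalog from the question, and it uses a json_encode/json_decode round trip as a stand-in for "a recursive dump"; that round trip is my assumption about how such an array is typically produced, not something stated in the question.

```php
<?php
// Shortened, hypothetical version of the question's catalog.
$xml = <<<XML
<catalog>
    <book><title>Harry Potter and the Chamber of Secrets</title></book>
    <movie><title>The Dark Knight</title></movie>
    <book><title>Great Expectations</title></book>
    <movie><title>Avatar</title></movie>
</catalog>
XML;

$catalog = simplexml_load_string($xml);

// Converting to a plain array groups children by element name,
// so the book/movie interleaving is no longer visible.
$asArray = json_decode(json_encode($catalog), true);
$groupedKeys = array_keys($asArray);

// Iterating over children() visits the elements in document order.
$iterated = [];
foreach ($catalog->children() as $child) {
    $iterated[] = $child->getName();
}

echo implode(', ', $groupedKeys), PHP_EOL; // book, movie
echo implode(', ', $iterated), PHP_EOL;    // book, movie, book, movie
```

The order is still there in the object; it is only the array-style view that hides it.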

In particular, running foreach ( $some_node->children() as $child_node ) will give you all the children of a node in the order they appear in the document, regardless of name, as the following demo shows.

Code:

$xml = <<<EOF
<catalog>
    <book>
        <title>Harry Potter and the Chamber of Secrets</title>
        <author>J.K. Rowling</author>
    </book>
    <movie>
        <title>The Dark Knight</title>
        <director>Christopher Nolan</director>
    </movie>
    <book>
        <title>Great Expectations</title>
        <author>Charles Dickens</author>
    </book>
    <movie>
        <title>Avatar</title>
        <director>James Cameron</director>
    </movie>
</catalog>
EOF;

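// Parse the string into a SimpleXMLElement
$sx = simplexml_load_string($xml);
// children() yields every child element in document order, whatever its name
foreach ( $sx->children() as $node )
{
    echo $node->getName(), '<br />';
}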

Output:

book
movie
book
movie
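
If you do need a plain array that keeps this order, one possible approach is to build it yourself while iterating. The sketch below is only an illustration under assumptions: the 'type' and 'title' keys and the $items variable are names I made up, and the catalog is shortened to titles only.

```php
<?php
// Shortened, hypothetical version of the catalog used above.
$xml = <<<XML
<catalog>
    <book><title>Harry Potter and the Chamber of Secrets</title></book>
    <movie><title>The Dark Knight</title></movie>
    <book><title>Great Expectations</title></book>
    <movie><title>Avatar</title></movie>
</catalog>
XML;

$catalog = simplexml_load_string($xml);

// One flat list, in document order, with the element name kept as 'type'.
$items = [];
foreach ($catalog->children() as $node) {
    $items[] = [
        'type'  => $node->getName(),
        // Cast to string; otherwise the value stays a SimpleXMLElement.
        'title' => (string) $node->title,
    ];
}

foreach ($items as $i => $item) {
    printf("%d. %s: %s\n", $i + 1, $item['type'], $item['title']);
}
```

Each entry carries its own type, so the book, movie, book, movie sequence survives in an ordinary array.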

answered Oct 05 '22 at 23:10 by IMSoP