I fail to understand why we need two XML parsers in PHP.
Can someone explain the difference between those two?
SimpleXML is an extension that allows us to easily manipulate and get XML data. SimpleXML provides an easy way of getting an element's name, attributes and textual content if you know the XML document's structure or layout.
There is no relation between PHP and XML. XML is something that PHP can consume and produce. There is nowhere during processing that PHP consumes or produces XML unless you explicitly tell PHP to do so.
I'm going to make the shortest answer possible so that beginners can take it away easily. I'm also slightly simplifying things for brevity's sake. Jump to the end of this answer for the overstated TL;DR version.
DOM and SimpleXML aren't actually two different parsers. The real parser is libxml2, which is used internally by DOM and SimpleXML. So DOM/SimpleXML are just two ways to use the same parser and they provide ways to convert one object to another.
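To illustrate the "convert one object to another" part, here is a minimal sketch using dom_import_simplexml() and simplexml_import_dom(). The sample document and the attribute name "source" are made up for the example.

```php
<?php
// Parse once with SimpleXML.
$sxe = simplexml_load_string('<root><item id="1">first</item></root>');

// View the same node through the DOM API. No re-parsing happens here:
// both objects wrap the same underlying libxml node.
$domElement = dom_import_simplexml($sxe);
$domElement->setAttribute('source', 'dom');

// The change made through DOM is visible through SimpleXML.
$source = (string) $sxe['source'];
echo $source, "\n";

// And back again: DOM node to SimpleXMLElement.
$back = simplexml_import_dom($domElement);
$firstItem = (string) $back->item;
echo $firstItem, "\n";
```

This is handy when most of the work is simple reading, but one step needs something only DOM offers.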
SimpleXML is intended to be very simple so it has a small set of functions, and it is focused on reading and writing data. That is, you can easily read or write an XML file, you can update some values or remove some nodes (with some limitations!), and that's it. No fancy manipulation, and you don't have access to the less common node types. For instance, SimpleXML cannot create a CDATA section although it can read them.
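As a rough sketch of what that looks like in practice, the snippet below reads, updates and removes data with SimpleXML, and reads (but does not create) a CDATA section. The library/book document is invented for the example.

```php
<?php
$xml = <<<XML
<library>
  <book id="b1"><title>Old title</title><note><![CDATA[5 < 6]]></note></book>
  <book id="b2"><title>Second</title></book>
</library>
XML;

$library = simplexml_load_string($xml);

// Read an element name, an attribute and some text content.
$rootName = $library->getName();
$firstId  = (string) $library->book[0]['id'];

// Update a value by plain assignment.
$library->book[0]->title = 'New title';
$newTitle = (string) $library->book[0]->title;

// Reading a CDATA section works: casting to string yields its text.
$note = (string) $library->book[0]->note;

// Remove a node with unset().
unset($library->book[1]);
$remaining = count($library->book);

echo $library->asXML();
```

Note that there is no SimpleXML call here that would let you write a new CDATA section; for that you need DOM.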
DOM offers a full-fledged implementation of the DOM plus a couple of non-standard methods such as appendXML. If you're used to manipulating the DOM in JavaScript, you'll find the same methods in PHP's DOM. There's basically no limitation on what you can do, and it even handles HTML. The flipside to this richness of features is that it is more complex and more verbose than SimpleXML.
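For comparison, here is a small DOM sketch that does two things SimpleXML does not offer: creating a CDATA section, and appending raw markup through DOMDocumentFragment::appendXML(). Element names and values are made up for the example.

```php
<?php
$doc  = new DOMDocument('1.0', 'UTF-8');
$root = $doc->appendChild($doc->createElement('root'));

// Create a CDATA section explicitly.
$note = $doc->createElement('note');
$note->appendChild($doc->createCDATASection('5 < 6'));
$root->appendChild($note);

// Non-standard helper: parse a string of markup into a fragment.
$fragment = $doc->createDocumentFragment();
$fragment->appendXML('<item id="1">first</item><item id="2">second</item>');
$root->appendChild($fragment);

// The familiar DOM lookup methods are available, as in JavaScript.
$items     = $doc->getElementsByTagName('item');
$itemCount = $items->length;
$secondId  = $items->item(1)->getAttribute('id');

$output = $doc->saveXML($root);
echo $output, "\n";
```

It is noticeably more verbose than the SimpleXML version of similar work, which is the trade-off described above.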
People often wonder/ask what extension they should use to handle their XML or HTML content. Actually the choice is easy because there isn't much of a choice to begin with:
In a nutshell:
SimpleXml
$root->foo->bar['attribute']
DOM
Both of these are based on libxml and can be influenced to some extent by the libxml functions.
Personally, I don't like SimpleXml too much. That's because I don't like the implicit access to the nodes, e.g. $foo->bar[1]->baz['attribute']. It ties the actual XML structure to the programming interface. The one-node-type-for-everything is also somewhat unintuitive because the behavior of the SimpleXmlElement magically changes depending on its contents.
For instance, when you have <foo bar="1"/>, the object dump of /foo/@bar will be identical to that of /foo, but doing an echo of them will print different results. Moreover, because both of them are SimpleXml elements, you can call the same methods on them, but they will only get applied when the SimpleXmlElement supports it, e.g. trying to do $el->addAttribute('foo', 'bar') on the first SimpleXmlElement will do nothing. Now of course it is correct that you cannot add an attribute to an attribute node, but the point is that an attribute node would not expose that method in the first place.
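A small sketch of the point about node types, using the same <foo bar="1"/> document: SimpleXML hands back the same class for the element and the attribute, while DOM gives each node kind its own class with its own methods.

```php
<?php
$xml = '<foo bar="1"/>';

// SimpleXML: element and attribute are both SimpleXMLElement objects.
$foo  = simplexml_load_string($xml);
$attr = $foo->xpath('/foo/@bar')[0];

$sxeElementClass = get_class($foo);
$sxeAttrClass    = get_class($attr);
$attrText        = (string) $attr; // text of the attribute
$fooText         = (string) $foo;  // text of the (empty) element

// DOM: the element is a DOMElement, the attribute is a DOMAttr.
$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath   = new DOMXPath($doc);
$domAttr = $xpath->query('/foo/@bar')->item(0);

$domElementClass = get_class($doc->documentElement);
$domAttrClass    = get_class($domAttr);

// A DOMAttr simply has no method for adding attributes.
$attrCanSetAttribute    = method_exists($domAttr, 'setAttribute');
$elementCanSetAttribute = method_exists($doc->documentElement, 'setAttribute');

echo "$sxeElementClass / $sxeAttrClass\n";
echo "$domElementClass / $domAttrClass\n";
```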
But that's just my 2c. Make up your own mind :)
On a sidenote, there are not just two parsers in PHP, but a couple more. SimpleXml and DOM are just the two that parse a document into a tree structure. The others are either pull or event based parsers/readers/writers.
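As an illustration of the pull-based style, here is a minimal sketch using XMLReader, one of PHP's other XML extensions. I'm picking XMLReader as the example myself, and the items document is invented. Instead of building a tree, the code walks the document node by node and picks out what it needs.

```php
<?php
$xml = '<items><item id="a">one</item><item id="b">two</item></items>';

$reader = new XMLReader();
$reader->XML($xml);

$ids = [];
// read() advances to the next node; nothing is kept in memory as a tree.
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'item') {
        $ids[] = $reader->getAttribute('id');
    }
}
$reader->close();

echo implode(',', $ids), "\n";
```

This style tends to matter for large documents, where loading the whole tree with DOM or SimpleXML would be wasteful.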
Also see my answer to