Is registerNamespace necessary in PHP's DOMXPath?

I'm working with an XML file like this (a standard container.xml from an EPUB book):

<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
   <rootfiles>
      <rootfile full-path="OEBPS/9780765348210.opf" media-type="application/oebps-package+xml"/>
   </rootfiles>
</container>

I'm trying to parse it using PHP. This is my code so far:

$c = new DOMDocument();
$c->load($filename);
$x = new DOMXPath($c);
//fine up to here!

//is this even what I'm supposed to be doing?
$x->registerNamespace('epub', 'urn:oasis:names:tc:opendocument:xmlns:container');
$root = $x->query('/epub:container/epub:rootfiles/epub:rootfile');

//fine down from here!
$opf = $root->item(0)->getAttribute('full-path'); //I know I should check if the element's there and if it has the attribute. Not important.

My question is: Is there a way not to do that registerNamespace call? I'm not sure if different epubs set this value a bit differently, and I need this code to work on any epub I throw at it.

cambraca asked Aug 07 '12 00:08



2 Answers

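AFAIK: no. XML documents can suffer from name collisions, which is why namespaces exist. In XPath 1.0 (which DOMXPath implements), an unprefixed element name in an expression only matches elements that are in no namespace, so to select the elements of a namespaced document by name you have to register the namespace under a prefix and use that prefix in the query.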

In your example the XML declares a default namespace (xmlns="<namespace identifier>"), so every element without a prefix belongs to that namespace. As long as you know that what you're looking for is in the default namespace, there is something a bit easier: instead of hard-coding the namespace URI, fetch it from the document like this:

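// ... load the DOMDocument into $c and create the DOMXPath $x as in your example ...

// $c->namespaceURI is null on the document node, so this call looks up
// the URI bound to the default (unprefixed) namespace
$defaultNamespace = $c->lookupNamespaceURI($c->namespaceURI);
$x->registerNamespace('epub', $defaultNamespace);

// ... now query like in your example
$root = $x->query('/epub:container/epub:rootfiles/epub:rootfile');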
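For reference, here is a minimal, self-contained sketch of the same idea with the checks the question skipped. It makes a few assumptions that are not in the snippet above: the XML is inlined as a string instead of loaded from $filename, the helper name findOpfPath is hypothetical, and the namespace is read from the root element via documentElement->namespaceURI rather than through lookupNamespaceURI. It also falls back to an unprefixed query when the root element has no namespace at all.

```php
<?php
// Sample container.xml from the question, inlined so the sketch runs on its own.
$xml = <<<'XML'
<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
   <rootfiles>
      <rootfile full-path="OEBPS/9780765348210.opf" media-type="application/oebps-package+xml"/>
   </rootfiles>
</container>
XML;

/**
 * Hypothetical helper (not part of the DOM extension): returns the
 * full-path attribute of the first rootfile element, or null if the
 * document, the element, or the attribute is missing.
 */
function findOpfPath(string $xml): ?string
{
    $doc = new DOMDocument();
    if (!$doc->loadXML($xml) || $doc->documentElement === null) {
        return null;
    }

    $xpath = new DOMXPath($doc);

    // Namespace URI of the root element; null when it is in no namespace.
    $ns = $doc->documentElement->namespaceURI;

    if ($ns !== null && $ns !== '') {
        // Register whatever URI the document uses instead of hard-coding it.
        $xpath->registerNamespace('epub', $ns);
        $query = '/epub:container/epub:rootfiles/epub:rootfile';
    } else {
        // No namespace declared: unprefixed names match directly.
        $query = '/container/rootfiles/rootfile';
    }

    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    $rootfile = $nodes->item(0);
    if (!$rootfile instanceof DOMElement || !$rootfile->hasAttribute('full-path')) {
        return null;
    }

    return $rootfile->getAttribute('full-path');
}

echo findOpfPath($xml), PHP_EOL;
```

The prefix 'epub' is only a local alias used inside the XPath expressions; it does not have to appear anywhere in the document.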
Max answered Oct 25 '22 11:10



To elaborate on Max's response: technically you can avoid registering a namespace on the DOMXPath, but only if the XML document itself does not declare a default namespace, so that none of its elements are associated with any namespace. However, since you're working with what appears to be an industry-standard format, my guess is that the namespace declaration in the document is required. If your XML document looked like the one below, you could skip the registerNamespace call and drop the 'epub' prefix from your queries.

<?xml version="1.0"?>
<container version="1.0">
   <rootfiles>
      <rootfile full-path="OEBPS/9780765348210.opf" media-type="application/oebps-package+xml"/>
   </rootfiles>
</container>

Most XML documents that aren't used exclusively within a single organization will have a default namespace declared, though.
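A small sketch of the difference, assuming two shortened variants of the container file (the a.opf path and the countMatches helper are made up for illustration): the same unprefixed query is run against a document without a namespace and against one with the default namespace declared.

```php
<?php
// Hypothetical helper: counts the nodes an XPath expression selects in an XML string.
function countMatches(string $xml, string $expr): int
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);
    $nodes = (new DOMXPath($doc))->query($expr);

    return $nodes === false ? 0 : $nodes->length;
}

// Same structure twice: once in no namespace, once in the container namespace.
$plain = '<container version="1.0"><rootfiles><rootfile full-path="a.opf"/></rootfiles></container>';
$namespaced = '<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">'
    . '<rootfiles><rootfile full-path="a.opf"/></rootfiles></container>';

// Unprefixed names only match elements that are in no namespace.
$expr = '/container/rootfiles/rootfile';

echo countMatches($plain, $expr), PHP_EOL;       // prints 1
echo countMatches($namespaced, $expr), PHP_EOL;  // prints 0
```

No error is raised in the second case; the query simply selects nothing, which is why a missing registerNamespace call tends to show up as an empty result rather than a failure.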

David Tran answered Oct 25 '22 12:10
