In PHP, I am trying to validate an XML document using a DTD specified by my application - not by the externally fetched XML document. The validate method in the DOMDocument class seems to only validate using the DTD specified by the XML document itself, so this will not work.
Can this be done, and how, or do I have to translate my DTD to an XML schema so I can use the schemaValidate method?
(This seems to have been asked in "Validate XML using a custom DTD in PHP", but without a correct answer, since the solution there relies only on the DTD specified by the target XML.)
A well-formed XML document can additionally be validated against a DTD (Document Type Definition) or an XSD (XML Schema Definition). To be well-formed, a document must have correct syntax and follow these rules, among others: the XML declaration, if present, must come first, and a single root element must enclose all other elements.
DOMDocument::validate() is a built-in PHP method that validates the document against its DTD (Document Type Definition). The DTD defines the rules or structure the XML file has to follow; if an XML document doesn't follow them, the method returns false.
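To illustrate the behaviour described above, here is a minimal sketch that needs no external files because the DTD sits in the document's internal subset. The element names (foo, bar, qux) and the variable names are made up for illustration; libxml_use_internal_errors() is used so that validity errors are collected instead of being emitted as PHP warnings.

```php
<?php
declare(strict_types=1);

// Hypothetical document: the inline DTD requires <foo> to contain one <bar>,
// but the body contains <qux> instead.
$xml = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE foo [
  <!ELEMENT foo (bar)>
  <!ELEMENT bar (#PCDATA)>
]>
<foo><qux>baz</qux></foo>
XML;

// Collect libxml errors instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument;
$loaded = $doc->loadXML($xml);   // the document is well-formed, so loading succeeds
$isValid = $doc->validate();     // false: the content does not match the DTD

// Keep the human-readable messages, then restore the previous error mode.
$messages = array_map(
    fn (LibXMLError $error) => trim($error->message),
    libxml_get_errors()
);
libxml_clear_errors();
libxml_use_internal_errors($previous);
```

Note that loading and validating are separate steps: a document can be well-formed (loadXML() succeeds) and still be invalid against its DTD.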
Note that some XML processors validate documents against XML schemas only and do not support validation against DTDs, although they may still accept documents that contain a DOCTYPE or that refer to DTDs. PHP's DOMDocument is not limited in this way: it offers both validate() for DTDs and schemaValidate() for XML schemas.
Note: XML validation could be subject to the Billion Laughs attack, and similar DoS vectors.
This essentially does what rojoca mentioned in his comment:
<?php
$xml = <<<END
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE foo SYSTEM "foo.dtd">
<foo>
<bar>baz</bar>
</foo>
END;
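$root = 'foo';
// Parse the fetched document; its own DOCTYPE will be discarded.
$old = new DOMDocument;
$old->loadXML($xml);
// Build an empty document whose DOCTYPE points at the application's DTD.
// Empty strings are used instead of null, which PHP 8.1+ deprecates for these string parameters.
$creator = new DOMImplementation;
$doctype = $creator->createDocumentType($root, '', 'bar.dtd');
$new = $creator->createDocument(null, '', $doctype);
$new->encoding = "utf-8";
// Copy the root element, with everything in it, into the new document.
$oldNode = $old->getElementsByTagName($root)->item(0);
$newNode = $new->importNode($oldNode, true);
$new->appendChild($newNode);
// validate() returns true if the document conforms to bar.dtd, false otherwise.
$isValid = $new->validate();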
?>
This will validate the document against bar.dtd.
You can't just call $new->loadXML(), because that would just set the DTD to the original, and the doctype property of a DOMDocument object is read-only, so you have to copy the root node (with everything in it) to a new DOM document.
I only just had a go with this myself, so I'm not entirely sure if this covers everything, but it definitely works for the XML in my example.
Of course, the quick-and-dirty solution would be to first get the XML as a string, search and replace the original DTD by your own DTD and then load it.
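A rough sketch of that quick-and-dirty route might look like the following. The helper name replaceDoctype() and the inline DTD are hypothetical, and the regular expression is deliberately naive: it assumes the identifiers contain no '>' or '[' characters and that no "<!DOCTYPE" text appears in a comment or CDATA section before the real declaration. The replacement DOCTYPE embeds the DTD inline only so that the example runs without any files; a declaration such as <!DOCTYPE foo SYSTEM "bar.dtd"> could be passed instead.

```php
<?php
declare(strict_types=1);

/**
 * Swap the first DOCTYPE declaration in an XML string for $doctype.
 * Naive on purpose; see the assumptions stated in the text above.
 */
function replaceDoctype(string $xml, string $doctype): string
{
    // Matches <!DOCTYPE ...> with an optional [internal subset].
    $pattern = '/<!DOCTYPE\s+[^\[>]*(?:\[.*?\]\s*)?>/s';
    // A callback avoids special handling of "$" and "\" in the replacement.
    $result = preg_replace_callback($pattern, fn () => $doctype, $xml, 1, $count);
    if ($result === null || $count === 0) {
        throw new InvalidArgumentException('No DOCTYPE declaration found.');
    }
    return $result;
}

$xml = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE foo SYSTEM "foo.dtd">
<foo>
<bar>baz</bar>
</foo>
XML;

// Hypothetical application DTD, embedded inline so the sketch needs no files.
$doctype = '<!DOCTYPE foo [<!ELEMENT foo (bar)><!ELEMENT bar (#PCDATA)>]>';

$rewritten = replaceDoctype($xml, $doctype);

$doc = new DOMDocument;
$doc->loadXML($rewritten);
$isValid = $doc->validate();   // validated against the swapped-in DTD
```

Compared with copying the root node into a new document, this works on the raw string, so it is only as reliable as the pattern used to find the DOCTYPE.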