
Validate XML using a custom DTD in PHP

Is there a way (without installing any libraries) of validating XML using a custom DTD in PHP?

michael asked Sep 19 '08 13:09



4 Answers

Take a look at PHP's DOM, especially DOMDocument::schemaValidate and DOMDocument::validate.

The example for DOMDocument::validate is fairly simple:

<?php
$dom = new DOMDocument;
// book.xml must carry a DOCTYPE that declares or references the DTD
$dom->load('book.xml');
if ($dom->validate()) {
    echo "This document is valid!\n";
}
?>
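The example above only reports success. As a minimal sketch of how the failure reasons could be surfaced (not part of the original answer), the snippet below collects libxml's messages through `libxml_use_internal_errors()` and `libxml_get_errors()`. The helper name `dtd_errors` and the tiny `note` documents are hypothetical, chosen only for illustration.

```php
<?php
// Hypothetical helper: validates an XML string against the DTD declared in
// its own DOCTYPE and returns libxml's messages (empty array when valid).
function dtd_errors($xml)
{
    // Collect errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $dom = new DOMDocument;
    if ($dom->loadXML($xml)) {
        $dom->validate();
    }

    $messages = array();
    foreach (libxml_get_errors() as $error) {
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }

    // Restore the previous error-handling mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $messages;
}

// Illustrative documents with an internal DTD subset.
$doctype = '<!DOCTYPE note [<!ELEMENT note (to)><!ELEMENT to (#PCDATA)>]>';
$good = $doctype . '<note><to>Tove</to></note>';
$bad  = $doctype . '<note><from>Jani</from></note>';

print_r(dtd_errors($good));
print_r(dtd_errors($bad));
```

Returning the messages rather than echoing them leaves it to the caller to decide how to report a failed validation.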
owenmarshall answered Oct 22 '22 18:10


If you have the DTD in a string, you can validate against it by passing it as the system ID through a data: wrapper:

$xml = '<?xml version="1.0"?>
        <!DOCTYPE note SYSTEM "note.dtd">
        <note>
            <to>Tove</to>
            <from>Jani</from>
            <heading>Reminder</heading>
            <body>Don\'t forget me this weekend!</body>
        </note>';

$dtd = '<!ELEMENT note (to,from,heading,body)>
        <!ELEMENT to (#PCDATA)>
        <!ELEMENT from (#PCDATA)>
        <!ELEMENT heading (#PCDATA)>
        <!ELEMENT body (#PCDATA)>';


$root = 'note';

$systemId = 'data://text/plain;base64,'.base64_encode($dtd);

$old = new DOMDocument;
$old->loadXML($xml);

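$creator = new DOMImplementation;
// Pass '' rather than null for the unused string arguments
// (null is deprecated for these parameters since PHP 8.1).
$doctype = $creator->createDocumentType($root, '', $systemId);
$new = $creator->createDocument(null, '', $doctype);
$new->encoding = "utf-8";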

$oldNode = $old->getElementsByTagName($root)->item(0);
$newNode = $new->importNode($oldNode, true);
$new->appendChild($newNode);

if (@$new->validate()) {
    echo "Valid";
} else {
    echo "Not valid";
}
Søren Jacobi answered Oct 22 '22 19:10


My interpretation of the original question is that we have an "on board" XML file that we want to validate against an "on board" DTD file. So here's how I would implement the "interpolate a local DTD inside the DOCTYPE element" idea expressed in comments by both Søren and PayamRWD:

public function validate($xml_realpath, $dtd_realpath=null) {
    // Enable user error handling before loading, so parse errors are collected too
    libxml_use_internal_errors(true);
    $xml_lines = file($xml_realpath);
    $doc = new DOMDocument;
    if ($dtd_realpath) {
        // Inject DTD inside DOCTYPE line.
        // file() keeps each line's trailing newline, so join with ''.
        $dtd = implode('', file($dtd_realpath));
        $new_lines = array();
        foreach ($xml_lines as $x) {
            // Assume DOCTYPE SYSTEM "blah blah" format:
            if (preg_match('/DOCTYPE/', $x)) {
                $y = preg_replace('/SYSTEM "(.*)"/', " [\n" . $dtd . "\n]", $x);
                $new_lines[] = $y;
            } else {
                $new_lines[] = $x;
            }
        }
        $doc->loadXML(implode('', $new_lines));
    } else {
        $doc->loadXML(implode('', $xml_lines));
    }
    if ($doc->validate()) {
        echo "Valid!\n";
    } else {
        echo "Not valid:\n";
        foreach (libxml_get_errors() as $error) {
            // print_r() with a second argument of true only returns the string,
            // so call it without that argument to actually print the error
            print_r($error);
        }
        libxml_clear_errors();
    }
}

Note that error handling has been suppressed for brevity, and there may be a better/more general way to handle the interpolation. But I have actually used this code with real data, and it works with PHP version 5.2.17.
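One possible alternative to the line-by-line interpolation, sketched here under assumptions rather than taken from this answer, is to reuse the `DOMImplementation` technique from the data-wrapper answer and point the new DOCTYPE's system ID at the local DTD file. The function name `validate_with_dtd_file` and the temporary files are hypothetical. The sketch also assumes libxml can open the DTD by its plain filesystem path, which may behave differently across platforms.

```php
<?php
// Hypothetical helper: validate the XML file at $xmlPath against the DTD file
// at $dtdPath, ignoring whatever DOCTYPE the XML file itself declares.
function validate_with_dtd_file($xmlPath, $dtdPath)
{
    $dtdRealPath = realpath($dtdPath);
    if ($dtdRealPath === false) {
        return false; // DTD file not found
    }

    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $valid = false;
    $source = new DOMDocument;
    if ($source->load($xmlPath) && $source->documentElement !== null) {
        $rootName = $source->documentElement->nodeName;

        // Build a new document whose DOCTYPE references the local DTD file.
        $impl = new DOMImplementation;
        $doctype = $impl->createDocumentType($rootName, '', $dtdRealPath);
        $target = $impl->createDocument(null, '', $doctype);
        $target->appendChild($target->importNode($source->documentElement, true));

        $valid = $target->validate();
    }

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $valid;
}

// Illustrative usage with throwaway files.
$dtdFile = tempnam(sys_get_temp_dir(), 'dtd');
$xmlFile = tempnam(sys_get_temp_dir(), 'xml');
file_put_contents($dtdFile, '<!ELEMENT note (to)><!ELEMENT to (#PCDATA)>');
file_put_contents($xmlFile, '<note><to>Tove</to></note>');

var_dump(validate_with_dtd_file($xmlFile, $dtdFile));
```

Compared with rewriting the DOCTYPE line by regex, this variant does not depend on how the original DOCTYPE is formatted, at the cost of copying the document into a second DOM tree.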

Peter answered Oct 22 '22 17:10


Trying to complete owenmarshall's answer:

In xml-validator.php (add the surrounding HTML such as html, head, body, ... as needed):

<?php

$dom = new DOMDocument;
$dom->load('template-format.xml');
if ($dom->validate()) {
    echo "This document is valid!\n";
}

?>

template-format.xml:

<?xml version="1.0" encoding="utf-8"?>

<!-- DTD to Validate against (format example) -->

<!DOCTYPE template-format [
  <!ELEMENT template-format (template)>
  <!ELEMENT template (background-color, color, font-size, header-image)>
  <!ELEMENT background-color (#PCDATA)>
  <!ELEMENT color (#PCDATA)>
  <!ELEMENT font-size (#PCDATA)>
  <!ELEMENT header-image (#PCDATA)>
]>

<!-- XML example -->

<template-format>

<template>

<background-color></background-color>
<color></color>
<font-size></font-size>
<header-image></header-image>

</template> 

</template-format>
PayamRWD answered Oct 22 '22 18:10