I'm writing a parser and trying to do proper error handling with exceptions.
The following sample code:
<?php
$xml = <<<XML
<?xml version="1.0"?>
<rootElem>
XML;
$reader = new XMLReader();
$reader->xml($xml, null, LIBXML_NOERROR | LIBXML_NOWARNING);
$reader->read();
Emits:
PHP Warning: XMLReader::read(): An Error Occured while reading in /Users/evert/code/xml/errortest.php on line 11
PHP Stack trace:
PHP 1. {main}() /Users/evert/code/xml/errortest.php:0
PHP 2. XMLReader->read() /Users/evert/code/xml/errortest.php:11
Adding:
libxml_use_internal_errors(true);
has no effect.
My goal is to check for errors later (with libxml_get_errors()) and throw an exception. I feel the only solution is to use the silence (@) operator, but that seems like a bad idea.
Note that when I neither pass the LIBXML constants nor use libxml_use_internal_errors, I get a different error, such as:
PHP Warning: XMLReader::read(): /Users/evert/code/xml/:2: parser error : Extra content at the end of the document in /Users/evert/code/xml/errortest.php on line 11
This suggests that the underlying libxml library is indeed suppressing the error, but XMLReader itself raises a warning anyway.
It looks like there is no way to suppress the warning other than @, since the PHP source for read() contains the following lines:
retval = xmlTextReaderRead(intern->ptr);
if (retval == -1) {
php_error_docref(NULL TSRMLS_CC, E_WARNING, "An Error Occured while reading");
RETURN_FALSE;
} else {
RETURN_BOOL(retval);
}
So only the actual parsing errors inside xmlTextReaderRead() are suppressed by libxml_use_internal_errors(true) or by the options passed to XMLReader::xml(). The generic warning is raised by XMLReader itself whenever that call returns -1.
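Putting that together with the goal from the question (collect the libxml errors afterwards and throw), a minimal sketch could look like the following. The helper name readAllOrThrow() and the choice of RuntimeException are illustrative assumptions, not part of XMLReader; the LIBXML_NOERROR and LIBXML_NOWARNING flags are deliberately not passed here, on the assumption that the errors should reach libxml_get_errors().

```php
<?php
// Hypothetical helper (not part of XMLReader): walks the whole document and
// throws if libxml recorded any error along the way.
function readAllOrThrow(string $xml): int
{
    // Collect libxml errors internally instead of emitting them as warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $reader = new XMLReader();
    $reader->xml($xml);

    $nodes = 0;
    // @ silences only the generic warning that read() raises itself.
    while (@$reader->read()) {
        $nodes++;
    }
    $reader->close();

    $errors = libxml_get_errors();
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($errors !== []) {
        $first = $errors[0];
        throw new RuntimeException(
            sprintf('XML error on line %d: %s', $first->line, trim($first->message))
        );
    }

    return $nodes;
}

try {
    readAllOrThrow('<rootElem>'); // unclosed element, as in the question
} catch (RuntimeException $e) {
    echo $e->getMessage(), PHP_EOL;
}
```

The @ stays confined to the one call that is known to raise the redundant warning, while the real diagnostics still come from libxml_get_errors().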
As I understand it, XMLReader has to make one full pass through the whole document in order to validate it.
What I'm doing is:
// Collect libxml errors internally instead of emitting them
libxml_use_internal_errors(true);
$xml = new \XMLReader();
$xsd = 'myfile.xsd';
$xml->open('myfile.xml');
$xml->setSchema($xsd);
// Make a full pass through the document. The only purpose is to force validation.
while (@$xml->read()) { } // empty loop
if (count(libxml_get_errors()) == 0) {
    echo "The provided XML is well-formed and XSD-valid";
    // Now you can process the document without @, since it is well-formed and XSD-valid.
    // Reopen the file first: XMLReader is forward-only and is now at the end.
} else {
    echo "The provided XML is malformed and/or not XSD-valid. Stopping";
}
Of course, you can check for errors inside the otherwise empty loop and break right after the first one. I've noticed that XMLReader does not give up completely after the first error: it continues and builds up an array of issues, which is useful. Sometimes it is more helpful to print every issue found than to stop processing at the first problem.
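Both variants (stop at the first issue, or collect everything) can be sketched roughly as below. The helper collectSchemaErrors(), its $stopAtFirst parameter, and the tiny schema written to a temporary file are all made up for illustration; only the XMLReader and libxml_* calls are real.

```php
<?php
// Hypothetical helper: validates $xml against the XSD at $xsdPath and returns
// the libxml messages. With $stopAtFirst it leaves the loop at the first issue.
function collectSchemaErrors(string $xml, string $xsdPath, bool $stopAtFirst = false): array
{
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $reader = new XMLReader();
    $reader->xml($xml);
    $reader->setSchema($xsdPath); // must be set before the first read()

    while (@$reader->read()) {
        if ($stopAtFirst && libxml_get_errors() !== []) {
            break; // first problem seen, stop reading
        }
    }

    $messages = [];
    foreach (libxml_get_errors() as $error) {
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }

    $reader->close();
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $messages;
}

// Illustrative schema: <items> holding one or more integer <item> elements.
$xsd = <<<'XSD'
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="items">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="item" type="xs:integer" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
XSD;
$xsdPath = tempnam(sys_get_temp_dir(), 'xsd');
file_put_contents($xsdPath, $xsd);

$invalid = '<items><item>1</item><item>abc</item><item>x</item></items>';
$all   = collectSchemaErrors($invalid, $xsdPath);        // every issue found
$first = collectSchemaErrors($invalid, $xsdPath, true);  // stops early
$none  = collectSchemaErrors('<items><item>1</item></items>', $xsdPath);
unlink($xsdPath);

foreach ($all as $message) {
    echo $message, PHP_EOL;
}
```

Checking libxml_get_errors() on every iteration costs a little, so for very large files it may be preferable to check only every few hundred nodes.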
My biggest question is what the isValid() function in XMLReader is actually for :) I think this approach is really a kind of workaround, but it works very well, and validating before processing covers about 95% of XMLReader use cases, since it is mostly used for processing large XML collections.