
Warning: DOMDocument::loadXML() [function.DOMDocument-loadXML]: Entity 'laquo' not defined in Entity

Tags: php, parsing, xml

I intercept the server's response and use XML and XSL to extract the required HTML fragments from it in response to a client request. For example, let's suppose that $content holds the server response before we process it.

    $dom = new domDocument();
    $dom->loadXML($content);
    $xslProgram = <<<xslProgram
<xsl:stylesheet version='1.0'
 xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>

<xsl:output method="html" encoding='UTF-8' indent="yes" />

<xsl:template match="/">
    <xsl:copy-of select="$select" />
</xsl:template>

</xsl:stylesheet>
xslProgram;

    $domXsl = new domDocument();
    $domXsl->loadXML($xslProgram);
    $xsl = new XSLTProcessor();
    $xsl->importStylesheet($domXsl);

    $content = $xsl->transformToXml($dom);

Everything appears to work correctly, but when the input contains &nbsp;, &laquo;, &raquo;, etc., this message appears: "Warning: DOMDocument::loadXML() [function.DOMDocument-loadXML]: Entity 'laquo' not defined in Entity"

At first I just replaced all these entities (&nbsp; and others) with their Unicode equivalents (str_replace), but then I realized that I can't account for every variant. How can I solve this problem?

Let me know if you don't understand me; I can write a better explanation.

Thanks, Ahmed.

Akhmed asked Dec 28 '22 23:12


2 Answers

HTML entities are not defined in XML, which is why you get those errors: XML itself predefines only &amp;, &lt;, &gt;, &quot; and &apos;. Have you considered using loadHTML() for your input document instead of loadXML()?

$dom = new domDocument();
$dom->loadHTML($content);

http://php.net/manual/en/domdocument.loadhtml.php
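To make the difference concrete, here is a minimal sketch of loading entity-laden markup with loadHTML(). The sample $content string and the XPath query are made up for illustration, and DOMXPath stands in for the XSL step from the question so the example needs only the DOM extension. libxml_use_internal_errors() is used on the assumption that real-world HTML may trigger parser warnings you would rather collect than print.

```php
<?php
// Hypothetical sample input; in the question, $content is the server response.
$content = '<html><body><p id="q">&laquo;quoted&raquo;&nbsp;text</p></body></html>';

// Collect parser warnings internally instead of emitting them,
// since real-world HTML is often not perfectly valid.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
// The HTML parser knows the named HTML entities, so &laquo; etc. resolve.
$dom->loadHTML($content);
libxml_clear_errors();

// DOMXPath is used here only as a stand-in for the XSL extraction step.
$xpath = new DOMXPath($dom);
$node  = $xpath->query('//p[@id="q"]')->item(0);

// The entities arrive as real characters (DOM strings are UTF-8).
echo $node->textContent, PHP_EOL;
```

The resulting $dom can then be handed to XSLTProcessor in the same way as a document loaded with loadXML().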

Tomalak answered May 09 '23 18:05


I think that if you passed $content through html_entity_decode first, your problems would disappear.
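One caveat with this idea: a plain html_entity_decode() call also decodes &amp; and &lt;, which can leave a raw & or < in the text and make the XML ill-formed. Below is a rough sketch of one way around that, assuming the input is otherwise well-formed XML. The helper name decodeHtmlEntitiesForXml and the sample string are hypothetical, not from any library. It decodes only named entities that XML does not predefine, then re-escapes the result so anything that decodes to a markup character stays safe.

```php
<?php
// Hypothetical helper: turn HTML-only named entities into literal UTF-8
// characters while leaving the five XML-predefined entities untouched.
function decodeHtmlEntitiesForXml(string $xml): string
{
    $xmlPredefined = ['amp', 'lt', 'gt', 'quot', 'apos'];

    return preg_replace_callback(
        '/&([A-Za-z][A-Za-z0-9]*);/',
        function (array $m) use ($xmlPredefined): string {
            if (in_array($m[1], $xmlPredefined, true)) {
                return $m[0]; // already valid in XML, keep as-is
            }
            $decoded = html_entity_decode($m[0], ENT_QUOTES | ENT_HTML5, 'UTF-8');
            // Re-escape so a result such as "&" or "<" (or an unknown
            // entity left undecoded) cannot break well-formedness.
            return htmlspecialchars($decoded, ENT_QUOTES | ENT_XML1, 'UTF-8');
        },
        $xml
    );
}

// Made-up sample standing in for the server response.
$content = '<p>&laquo;A &amp; B&raquo;&nbsp;&lt;ok&gt;</p>';

$dom    = new DOMDocument();
$loaded = $dom->loadXML(decodeHtmlEntitiesForXml($content));

echo $dom->documentElement->textContent, PHP_EOL;
```

Unknown entity names end up as literal text (for example "&foo;" becomes "&amp;foo;"), which keeps the document loadable at the cost of showing the raw name.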

smcphill answered May 09 '23 19:05