I'm a little new to PHP parsing, but I can't seem to get PHP's DOMDocument to return what is clearly an identifiable node. The HTML will come from the 'net, so I can't necessarily guarantee XML compliance. I tried the following:
<?php
header("Content-Type: text/plain");
$html = '<html><body>Hello <b id="bid">World</b>.</body></html>';
$dom = new DOMDocument;
$dom->preserveWhiteSpace = false;
$dom->validateOnParse = true;
/*** load the html into the object ***/
$dom->loadHTML($html);
var_dump($dom);
$belement = $dom->getElementById("bid");
var_dump($belement);
?>
Though I receive no error, I only receive the following as output:
object(DOMDocument)#1 (0) {
}
NULL
Should I not be able to look up the <b> tag, as it does indeed have an id?
DOMDocument::getElementById() is a built-in PHP method that searches for an element with a given ID. It accepts a single parameter, $elementId, which holds the ID to search for, and returns the matching DOMElement, or NULL if the element is not found.
The Manual explains why:
For this function to work, you will need either to set some ID attributes with DOMElement->setIdAttribute() or a DTD which defines an attribute to be of type ID. In the later case, you will need to validate your document with DOMDocument->validate() or DOMDocument->validateOnParse before using this function.
By all means, go for valid HTML & provide a DTD.
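As a rough illustration of the setIdAttribute() route the manual mentions, the sketch below marks every "id" attribute as an ID before looking one up. The helper name registerIdAttributes is made up for this example and is not part of the DOM extension. XML input is used because, without a DTD, a plain "id" attribute there is not treated as an ID; with loadHTML() the behaviour may differ between PHP/libxml versions.

```php
<?php
// Hypothetical helper (not part of PHP's DOM extension): flag every "id"
// attribute in the document as an ID so getElementById() can resolve it.
function registerIdAttributes(DOMDocument $dom): void
{
    $xpath = new DOMXPath($dom);
    foreach ($xpath->query('//*[@id]') as $element) {
        // Second argument true means "treat this attribute as an ID".
        $element->setIdAttribute('id', true);
    }
}

$dom = new DOMDocument();
// No DTD here, so "id" is just an ordinary attribute at this point.
$dom->loadXML('<root>Hello <b id="bid">World</b>.</root>');

registerIdAttributes($dom);

$belement = $dom->getElementById('bid');
if ($belement !== null) {
    echo $belement->nodeValue, "\n";
}
```

This avoids validation entirely, at the cost of one extra pass over the document.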
Quick fixes:
Call $dom->validate(); and put up with the errors (or fix them); afterwards you can use $dom->getElementById(), regardless of the errors for some reason.
Use XPath instead: $x = new DOMXPath($dom); $el = $x->query("//*[@id='bid']")->item(0);
Set validateOnParse to true before loading the HTML; that would also work ;P.
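For reference, here is the XPath lookup from the quick fixes written out as a complete script. It is only a sketch: the libxml_use_internal_errors() call is an addition not in the original answer, included on the assumption that HTML from the 'net will trigger parser warnings you would rather collect than print.

```php
<?php
// Assumption: collect libxml warnings about malformed markup instead of printing them.
libxml_use_internal_errors(true);

$html = '<html><body>Hello <b id="bid">World</b>.</body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

// Match any element whose id attribute equals "bid"; no DTD or validation needed.
$xpath = new DOMXPath($dom);
$belement = $xpath->query("//*[@id='bid']")->item(0);

if ($belement !== null) {
    echo $belement->nodeValue, "\n"; // prints "World"
}
```

Because the query matches on the attribute's value rather than on its ID type, it does not depend on how the document was validated.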
$dom = new DOMDocument();
$html ='<html>
<body>Hello <b id="bid">World</b>.</body>
</html>';
$dom->validateOnParse = true; //<!-- this first
$dom->loadHTML($html); //'cause 'load' == 'parse
$dom->preserveWhiteSpace = false;
$belement = $dom->getElementById("bid");
echo $belement->nodeValue;
Outputs 'World' here.