I'm trying to parse some HTML with DOM in PHP, but I'm having some problems. First, in case this changes the solution, the HTML that I have is not a full page; rather, it's only part of it.
<!-- This is the HTML that I have --><a href='/games/'>
<div id='game'>
<img src='http://images.example.com/games.gif' width='300' height='137' border='0'>
<br><b> Game </b>
</div>
<div id='double'>
<img src='http://images.example.com/double.gif' width='300' height='27' border='0' alt='' title=''>
</div>
</a>
Now I'm trying to get only the div with the id double. I've tried the following code, but it doesn't seem to be working properly. What might I be doing wrong?
//The HTML has been loaded into the variable $html
$dom=new domDocument;
$dom->loadHTML($html);
$dom->preserveWhiteSpace = false;
$keepme = $dom->getElementById('double');
$contents = '<div style="text-align:center">'.$keepme.'</a></div>';
echo $contents;
I think DOMDocument::getElementById will not work in your case (quoting the manual):
For this function to work, you will need either to set some ID attributes with DOMElement::setIdAttribute or a DTD which defines an attribute to be of type ID. In the later case, you will need to validate your document with DOMDocument::validate or DOMDocument->validateOnParse before using this function.
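To illustrate what the quoted passage describes, here is a minimal sketch using a small XML document (the markup and the element name item are made up for the example, not taken from your HTML). As far as I know, without a DTD the id attribute is just an ordinary attribute until it is flagged with setIdAttribute:

```php
<?php
// Hypothetical XML document with no DTD, so "id" is not yet known to be of type ID.
$doc = new DOMDocument();
$doc->loadXML('<root><item id="double"/></root>');

// Lookup before flagging the attribute: nothing is registered as an ID yet.
$before = $doc->getElementById('double');

// Flag the "id" attribute of the <item> element as being of type ID.
$item = $doc->documentElement->firstChild;
$item->setIdAttribute('id', true);

// The same lookup should now find the element.
$after = $doc->getElementById('double');

var_dump($before === null);
var_dump($after !== null ? $after->nodeName : null);
```

Doing this by hand for every element is not very practical for an arbitrary HTML fragment, which is why a query-based approach is attractive here.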
A solution that might work is using some XPath query to extract the element you are looking for.
First of all, let's load the HTML portion, as you did:
$dom=new domDocument;
$dom->loadHTML($html);
var_dump($dom->saveHTML());
The var_dump is here only to prove that the HTML portion has been loaded successfully; judging from its output, it has.
Then, instantiate the DOMXPath class, and use it to query for the element you want to get:
$xpath = new DOMXpath($dom);
$result = $xpath->query("//*[@id = 'double']");
$keepme = $result->item(0);
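Note that item(0) returns null when nothing matches, so it may be worth checking the result before going further. A minimal sketch, assuming a shortened version of your fragment as $html (the sample string and the messages are only for illustration):

```php
<?php
// Hypothetical sample input, shortened from the fragment in the question.
$html = "<a href='/games/'><div id='game'><b> Game </b></div>"
      . "<div id='double'><img src='http://images.example.com/double.gif' alt=''></div></a>";

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath  = new DOMXPath($dom);
$result = $xpath->query("//*[@id = 'double']");

if ($result === false || $result->length === 0) {
    // query() gives false for an invalid expression and an empty list when nothing matches.
    $keepme = null;
    echo "No element with id 'double' found\n";
} else {
    $keepme = $result->item(0);
    echo "Found a <", $keepme->nodeName, "> element\n";
}
```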
We now have the element you want ;-)
But, in order to inject its HTML content in another HTML segment, we must first get its HTML content.
I don't remember any "easy" way to do that, but something like this should do the trick:
$tempDom = new DOMDocument();
$tempImported = $tempDom->importNode($keepme, true);
$tempDom->appendChild($tempImported);
$newHtml = $tempDom->saveHTML();
var_dump($newHtml);
And... we have the HTML content of your double <div>:
string '<div id="double">
<img src="http://images.example.com/double.gif" width="300" height="27" border="0" alt="" title="">
</div>
' (length=125)
Now, you just have to do whatever you want with it ;-)
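For instance, on PHP 5.3.6 or later, saveHTML() accepts an optional node argument, so the temporary document can probably be skipped. Here is a minimal sketch that builds the wrapper from the question that way; it assumes a shortened version of your fragment as $html, and the sample string is only for illustration:

```php
<?php
// Hypothetical sample input, shortened from the fragment in the question.
$html = "<a href='/games/'><div id='game'><b> Game </b></div>"
      . "<div id='double'><img src='http://images.example.com/double.gif' alt=''></div></a>";

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath  = new DOMXPath($dom);
$keepme = $xpath->query("//*[@id = 'double']")->item(0);

// Serialize only the matched node; fall back to an empty string if nothing matched.
$newHtml = $keepme !== null ? $dom->saveHTML($keepme) : '';

// A DOMElement cannot be concatenated directly, so the serialized string is used instead.
$contents = '<div style="text-align:center">' . $newHtml . '</div>';
echo $contents, "\n";
```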