I was trying to do it with "getElementsByTagName", but it wasn't working, I'm new to using DOMDocument to parse HTML, as I used to use regex until yesterday some kind fokes here told me that DOMEDocument would be better for the job, so I'm giving it a try :)
I google around for a while looking for some explains but didn't find anything that helped (not with the class anyway)
So I want to capture "Capture this text 1" and "Capture this text 2" and so on.
Doesn't look to hard, but I can't figure it out :(
<div class="main"> <div class="text"> Capture this text 1 </div> </div> <div class="main"> <div class="text"> Capture this text 2 </div> </div>
To add the dynamic data (HTML content) at a certain point in PHP code, we need parsing. For example: For adding the data (info) in the form of HTML, we need to make that dynamic template in string and then convert it to HTML. How should we do parsing? We should use loadHTML() function for parsing.
If you just want to parse HTML and your HTML is intended for the body of your document, you could do the following : (1) var div=document. createElement("DIV"); (2) div. innerHTML = markup; (3) result = div. childNodes; --- This gives you a collection of childnodes and should work not just in IE8 but even in IE6-7.
Parsing means analyzing and converting a program into an internal format that a runtime environment can actually run, for example the JavaScript engine inside browsers. The browser parses HTML into a DOM tree. HTML parsing involves tokenization and tree construction.
If you want to get :
<div>
tag with class="text"
<div>
with class="main"
I would say the easiest way is not to use DOMDocument::getElementsByTagName
-- which will return all tags that have a specific name (while you only want some of them).
Instead, I would use an XPath query on your document, using the DOMXpath
class.
For example, something like this should do, to load the HTML string into a DOM object, and instance the DOMXpath
class :
$html = <<<HTML <div class="main"> <div class="text"> Capture this text 1 </div> </div> <div class="main"> <div class="text"> Capture this text 2 </div> </div> HTML; $dom = new DOMDocument(); $dom->loadHTML($html); $xpath = new DOMXPath($dom);
And, then, you can use XPath queries, with the DOMXPath::query
method, that returns the list of elements you were searching for :
$tags = $xpath->query('//div[@class="main"]/div[@class="text"]'); foreach ($tags as $tag) { var_dump(trim($tag->nodeValue)); }
And executing this gives me the following output :
string 'Capture this text 1' (length=19) string 'Capture this text 2' (length=19)
You can use http://simplehtmldom.sourceforge.net/
It is very simple easy to use DOM parser written in php, by which you can easily fetch the content of div tag.
Something like this:
// Find all <div> which have attribute id=text $ret = $html->find('div[id=text]');
See the documentation of it for more help.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With