how to use dom php parser

Tags:

dom

php

html-parsing

I'm new to DOM parsing in PHP.
I have an HTML file that I'm trying to parse. It has a bunch of DIVs like this:

<div id="interestingbox"> 
   <div id="interestingdetails" class="txtnormal">
        <div>Content1</div>
        <div>Content2</div>
   </div>
</div>

<div id="interestingbox"> 
......

I'm trying to get the contents of the many div boxes using PHP. How can I use the DOM parser to do this?

Thanks!


asked Jun 06 '09 23:06

chris

2 Answers

First, I have to tell you that you can't use the same id on two different divs; that is what classes are for. Every id should be unique within the document.

Code to get the contents of the div with id="interestingbox":

$html = '
<html>
<head></head>
<body>
<div id="interestingbox"> 
   <div id="interestingdetails" class="txtnormal">
        <div>Content1</div>
        <div>Content2</div>
   </div>
</div>

<div id="interestingbox2"><a href="#">a link</a></div>
</body>
</html>';


$dom_document = new DOMDocument();

$dom_document->loadHTML($html);

// use DOMXPath to navigate the parsed HTML
$dom_xpath = new DOMXPath($dom_document);

// get the div with id=interestingbox; the leading "//" matches it at any depth
$elements = $dom_xpath->query("//div[@id='interestingbox']");

// query() returns false (not null) if the expression is malformed
if ($elements !== false) {

  foreach ($elements as $element) {
    echo "\n[". $element->nodeName. "]";

    $nodes = $element->childNodes;
    foreach ($nodes as $node) {
      echo $node->nodeValue. "\n";
    }

  }
}

//OUTPUT (roughly; the exact whitespace differs)
[div]
        Content1
        Content2
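
If the HTML cannot be changed and the id really is repeated, as in the question, XPath should still find every matching div, because it compares the attribute value instead of relying on ids being unique. A minimal sketch under that assumption; the second box with Content3, the variable names and the relative path ./div/div are made up for illustration:

```php
<?php
// Hypothetical input mirroring the question: the same id is reused on two boxes.
$html = '
<html><body>
<div id="interestingbox">
   <div id="interestingdetails" class="txtnormal">
        <div>Content1</div>
        <div>Content2</div>
   </div>
</div>
<div id="interestingbox">
   <div id="interestingdetails" class="txtnormal">
        <div>Content3</div>
   </div>
</div>
</body></html>';

// Duplicate ids may make libxml emit warnings; collect them instead of printing.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

$boxes = [];
foreach ($xpath->query("//div[@id='interestingbox']") as $box) {
    $contents = [];
    // "./div/div" is evaluated relative to the current box (second argument).
    foreach ($xpath->query('./div/div', $box) as $inner) {
        $contents[] = trim($inner->textContent);
    }
    $boxes[] = $contents;
}

print_r($boxes);
```

Here $boxes ends up with one array of trimmed strings per box, which is usually easier to work with than echoing nodeValue directly.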

Example with classes:

$html = '
<html>
<head></head>
<body>
<div class="interestingbox"> 
   <div id="interestingdetails" class="txtnormal">
        <div>Content1</div>
        <div>Content2</div>
   </div>
</div>

<div class="interestingbox"><a href="#">a link</a></div>
</body>
</html>';

// the same as before, just change the XPath expression

[...]

$elements = $dom_xpath->query("//div[@class='interestingbox']");

[...]

//OUTPUT (roughly; the exact whitespace differs)
[div]
        Content1
        Content2

[div]
a link
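
One caveat with the class example: @class='interestingbox' only matches when the attribute is exactly that string, so a div with class="interestingbox highlighted" would be skipped. A common XPath 1.0 idiom for matching a single class token is sketched below; the sample markup and the extra class names are assumptions for illustration:

```php
<?php
// Hypothetical markup: one div carries a second class, one has a similar name.
$html = '
<html><body>
<div class="interestingbox highlighted">first</div>
<div class="interestingbox">second</div>
<div class="interestingboxes">not a match</div>
</body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Pad the class list with spaces so only the whole token "interestingbox" matches.
$query = "//div[contains(concat(' ', normalize-space(@class), ' '), ' interestingbox ')]";

$matches = [];
foreach ($xpath->query($query) as $div) {
    $matches[] = $div->textContent;
}

print_r($matches);
```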

Refer to the DOMXPath page for more details.

answered Oct 05 '22 22:10

apelliciari

I got this to work using simplehtmldom as a start:

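// assumes simple_html_dom.php (Simple HTML DOM library) is on the include path
require_once 'simple_html_dom.php';
// pass a full URL or a local file path
$html = file_get_html('http://example.com/');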
foreach ($html->find('div[id=interestingbox]') as $result)
{
    echo $result->innertext;
}
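
For comparison, roughly the same loop seems doable with PHP's built-in DOM extension and no extra library. This is only a sketch: the temporary file stands in for the real HTML file, and joining saveHTML() output for each child node is an assumed stand-in for simplehtmldom's innertext, not an exact equivalent.

```php
<?php
// Hypothetical local file, created here only so the sketch is self-contained.
$path = tempnam(sys_get_temp_dir(), 'dom');
file_put_contents($path, '<html><body>
<div id="interestingbox"><div id="interestingdetails" class="txtnormal"><div>Content1</div><div>Content2</div></div></div>
</body></html>');

libxml_use_internal_errors(true);   // keep parser warnings out of the output
$doc = new DOMDocument();
$doc->loadHTMLFile($path);
libxml_clear_errors();
unlink($path);

$xpath = new DOMXPath($doc);

$results = [];
foreach ($xpath->query("//div[@id='interestingbox']") as $box) {
    // Serialize every child node to rebuild the inner HTML of the box.
    $inner = '';
    foreach ($box->childNodes as $child) {
        $inner .= $doc->saveHTML($child);
    }
    $results[] = $inner;
    echo $inner, "\n";
}
```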

answered Oct 05 '22 23:10

chris
