
PHP crawler for one special HTML element

Tags:

html

php

We have this simple HTML page (for testing):

<html>
<body>
<div class="my"> One </div>
<div class="my"> Two </div>
<div class="my"> Three </div>
<div class="other"> NO </div>
<div class="other2"> NO </div>
</body>
</html>

I need some very simple PHP code to crawl this page. I want to end up with "One", "Two", "Three" in a PHP array: that is, the content of every element with the class "my", and nothing from the other classes.

user3271403 asked Jan 26 '26 18:01



2 Answers

Try this. You can use XPath to get your result:

$html = '<html>
            <body>
            <div class="my"> One </div>
            <div class="my"> Two </div>
            <div class="my"> Three </div>
            <div class="other"> NO </div>
            <div class="other2"> NO </div>
            </body>
        </html>';

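// Parse the markup into a DOM tree.
$dom = new DOMDocument();
$dom->loadHTML($html);

// Select every <div> whose class attribute is exactly "my".
$xpath = new DOMXPath($dom);
$tags = $xpath->query('//div[@class="my"]');
foreach ($tags as $tag) {
    // nodeValue keeps the surrounding spaces, so trim them off.
    $node_value = trim($tag->nodeValue);
    echo $node_value."<br/>";
}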
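
Since the question asks for a PHP array rather than printed output, here is a minimal sketch of one way to collect the values. The helper name `textsByClass` is hypothetical, and the XPath expression is swapped for a token match so that an element such as `class="my big"` is also picked up; that behaviour is an assumption about what is wanted, not something the question states. It also assumes the class name contains no quote characters.

```php
<?php
// Hypothetical helper (not part of the answer above): returns the trimmed
// text of every <div> whose class list contains $class.
// Assumes $class is a plain class name with no quote characters.
function textsByClass(string $html, string $class): array
{
    $dom = new DOMDocument();
    // Collect libxml warnings internally instead of printing them,
    // since real-world pages are rarely perfectly valid.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // Pad the class list with spaces so "my" does not match "mystery".
    $query = sprintf(
        '//div[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $class
    );

    $texts = [];
    foreach ($xpath->query($query) as $node) {
        $texts[] = trim($node->nodeValue);
    }
    return $texts;
}

$html = '<html><body>
<div class="my"> One </div>
<div class="my big"> Two </div>
<div class="my"> Three </div>
<div class="other"> NO </div>
</body></html>';

print_r(textsByClass($html, 'my'));
```

If only an exact `class="my"` should match, the simpler `//div[@class="my"]` query from the answer can be kept and just the array-building loop reused.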
Satish Sharma answered Jan 29 '26 07:01



You should use the DOMDocument class:

<?php

$html='<html>
<body>
<div class="my"> One </div>
<div class="my"> Two </div>
<div class="my"> Three </div>
<div class="other"> NO </div>
<div class="other2"> NO </div>
</body>
</html>';
$dom = new DOMDocument;
$dom->loadHTML($html);
// Walk every <div> and keep only those whose class attribute is exactly "my".
foreach ($dom->getElementsByTagName('div') as $tag) {
    if ($tag->getAttribute('class') === 'my') {
        echo $tag->nodeValue; // the text content between the tags
    }
}

OUTPUT :

One Two Three
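
The same loop can fill an array instead of echoing, which is what the question asks for. This is a rough sketch under assumptions: the function name `divTextsWithClass` is made up for illustration, and splitting the class attribute on whitespace (so that `class="my big"` also counts) goes beyond the exact comparison used above.

```php
<?php
// Hypothetical helper: collects the trimmed text of each <div> that has
// $class among its space-separated class names.
function divTextsWithClass(DOMDocument $dom, string $class): array
{
    $texts = [];
    foreach ($dom->getElementsByTagName('div') as $tag) {
        // getAttribute() returns "" when the attribute is missing,
        // which simply yields no match below.
        $classes = preg_split('/\s+/', trim($tag->getAttribute('class')));
        if (in_array($class, $classes, true)) {
            $texts[] = trim($tag->nodeValue);
        }
    }
    return $texts;
}

$dom = new DOMDocument;
$dom->loadHTML('<html><body>
<div class="my"> One </div>
<div class="my"> Two </div>
<div class="my"> Three </div>
<div class="other"> NO </div>
</body></html>');

// Joined with commas only for display; the array itself is the result.
echo implode(', ', divTextsWithClass($dom, 'my'));
```

No XPath is needed for this variant, at the cost of visiting every `<div>` in the document.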
Shankar Narayana Damodaran answered Jan 29 '26 08:01
