Retrieve elements with xpath and DOMDocument

I have a list of ads in the HTML code below. What I need is a PHP loop to get the following elements for each ad:

ad URL (href attribute of <a> tag)
ad image URL (src attribute of <img> tag)
ad title (html content of <div class="title"> tag)

<div class="ads">
    <a href="http://path/to/ad/1">
        <div class="ad">
            <div class="image">
                <div class="wrapper">
                    <img src="http://path/to/ad/1/image.jpg">
                </div>
            </div>
            <div class="detail">
                <div class="title">Ad #1</div>
            </div>
        </div>
    </a>
    <a href="http://path/to/ad/2">
        <div class="ad">
            <div class="image">
                <div class="wrapper">
                    <img src="http://path/to/ad/2/image.jpg">
                </div>
            </div>
            <div class="detail">
                <div class="title">Ad #2</div>
            </div>
        </div>
    </a>
</div>

I managed to get the ad URL with the PHP code below.

$d = new DOMDocument();
$d->loadHTML($ads); // the variable $ads contains the HTML code above
$xpath = new DOMXPath($d);
$ls_ads = $xpath->query('//a');

foreach ($ls_ads as $ad) {
    $ad_url = $ad->getAttribute('href');
    print("AD URL : $ad_url");
}

But I didn't manage to get the two other elements (image URL and title). Any ideas?

asked Sep 22 '12 20:09

user1691355

1 Answer

I managed to get what I needed with this code (based on Khue Vu's code):

$d = new DOMDocument();
$d->loadHTML($ads); // the variable $ads contains the HTML code above
$xpath = new DOMXPath($d);
$ls_ads = $xpath->query('//a');

foreach ($ls_ads as $ad) {
    // get ad url
    $ad_url = $ad->getAttribute('href');

    // copy the current ad node into its own DOMDocument so we can parse it in isolation
    // (a deep importNode() already copies the whole subtree, so no separate cloneNode() is needed)
    $ad_doc = new DOMDocument();
    $ad_doc->appendChild($ad_doc->importNode($ad, true));
    // use a separate variable so the outer $xpath is not overwritten
    $ad_xpath = new DOMXPath($ad_doc);

    // get ad title
    $ad_title_tag = $ad_xpath->query("//div[@class='title']");
    $ad_title = trim($ad_title_tag->item(0)->nodeValue);

    // get ad image
    $ad_image_tag = $ad_xpath->query("//img/@src");
    $ad_image = $ad_image_tag->item(0)->nodeValue;
}
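
A possible alternative, not part of the original answer: instead of copying each ad into its own document, the ad node can be passed as the context node to DOMXPath, with relative paths (starting with `.`) so the search stays inside that ad. The sketch below is self-contained and makes a few assumptions: the `$ads` sample is a condensed copy of the markup from the question, the `$results` array and the `//div[@class='ads']/a` selector are illustrative choices, and `libxml_use_internal_errors()` is only there to keep parser warnings quiet.

```php
<?php
// Condensed copy of the question's markup (assumed sample data).
$ads = <<<'HTML'
<div class="ads">
    <a href="http://path/to/ad/1">
        <div class="ad">
            <div class="image"><div class="wrapper"><img src="http://path/to/ad/1/image.jpg"></div></div>
            <div class="detail"><div class="title">Ad #1</div></div>
        </div>
    </a>
    <a href="http://path/to/ad/2">
        <div class="ad">
            <div class="image"><div class="wrapper"><img src="http://path/to/ad/2/image.jpg"></div></div>
            <div class="detail"><div class="title">Ad #2</div></div>
        </div>
    </a>
</div>
HTML;

libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$d = new DOMDocument();
$d->loadHTML($ads);
libxml_clear_errors();

$xpath = new DOMXPath($d);
$results = []; // hypothetical container: one entry per ad

foreach ($xpath->query("//div[@class='ads']/a") as $ad) {
    $results[] = [
        'url'   => $ad->getAttribute('href'),
        // the leading "." makes each path relative to the current <a> node
        'image' => $xpath->evaluate('string(.//img/@src)', $ad),
        'title' => trim($xpath->evaluate("string(.//div[@class='title'])", $ad)),
    ];
}

foreach ($results as $r) {
    print("{$r['url']} | {$r['image']} | {$r['title']}\n");
}
```

With `evaluate()` and the XPath `string()` function, a missing image or title yields an empty string rather than a call on a null `item(0)`, which may be easier to handle if some ads lack one of the elements.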

answered Sep 28 '22 05:09

user1691355

Tags: php, xpath, domdocument
