Use PHP Simple HTML DOM Parser to find table cell and get contents of next sibling

Question

I am trying to use PHP Simple HTML DOM Parser to grab the HTML of an external file. The file contains a table, and the goal is to find a table cell with specific contents and then get the next sibling cell's data. This data needs to be placed into a PHP variable.

Based on research and information found in articles like "How to parse and process HTML/XML with PHP?", "Grabbing the href attribute of an A element", "Scraping Data: PHP Simple HTML DOM Parser" and, of course, the PHP Simple HTML DOM Parser Manual, I've been able to produce some results, but I'm afraid I may be on the wrong track.

The table row looks like this:

<tr>
<td>fluff</td>  
<td>irrelevant</td> 
<td>etc</td>   
<td><a href="one">Hello world</a></td>                        
<td>123.456</td> 
<td>fluff</td>          
<td>irrelevant</td>   
<td>etc</td>
</tr>

What I'm trying to accomplish is to find the table cell that contains "Hello world", and then get the number from within the next td cell. The following code finds that table cell and echoes its contents, but my attempts to use it as a landmark in order to get the next cell's data have failed...

$html = file_get_html("http://site.com/stuff.htm");
$e = $html->find('td',0)->innertext = 'Hello world';
echo $e;

So ultimately, in the example above the value of 123.456 needs to somehow get into a PHP variable.

Thanks for your help!

hek2mgl · Accepted Answer

It can be done using the DOMXPath class. You won't need an external library for this.

Here is an example:

<?php

$html = <<<EOF
<tr>
<td>fluff</td>  
<td>irrelevant</td> 
<td>etc</td>   
<td><a href="one">Hello world</a></td>                        
<td>123.456</td> 
<td>fluff</td>          
<td>irrelevant</td>   
<td>etc</td>
</tr>
EOF;


// create empty document 
$document = new DOMDocument();

// load html
$document->loadHTML($html);

// create xpath selector
$selector = new DOMXPath($document);

// select the parent node (the <td>) of each <a> node
// whose text content is 'Hello world'
$results = $selector->query('//td/a[text()="Hello world"]/..');

// output the results 
foreach($results as $node) {
    echo $node->nodeValue . PHP_EOL;
}
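
Note that the query above returns the landmark cell itself (the one containing "Hello world"), not the number in the cell after it. One way to reach the next cell, offered here as a minimal sketch and not as part of the original answer, is the XPath `following-sibling` axis. The variable names `$cells` and `$value` are illustrative, and the sample row is shortened and wrapped in a `<table>` element as an assumption about the real page:

```php
<?php

// Shortened sample markup based on the question, wrapped in <table>
// (an assumption about how the real page is structured).
$html = <<<EOF
<table><tr>
<td>etc</td>
<td><a href="one">Hello world</a></td>
<td>123.456</td>
<td>fluff</td>
</tr></table>
EOF;

$document = new DOMDocument();
$document->loadHTML($html);

$selector = new DOMXPath($document);

// Match the <td> that contains <a>Hello world</a>, then step to the
// first <td> that follows it inside the same row.
$cells = $selector->query('//td[a[text()="Hello world"]]/following-sibling::td[1]');

// Keep the cell text in a PHP variable, or null when nothing matched.
$value = $cells->length > 0 ? trim($cells->item(0)->nodeValue) : null;

echo $value . PHP_EOL;
```

With the sample row this should leave the string 123.456 in `$value`. For a remote page, the same query could be run after loading the fetched HTML into the `DOMDocument`.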

Adidi · Answer

Using Simple HTML DOM Parser:

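// Assumes the library file simple_html_dom.php is on the include path.
require_once 'simple_html_dom.php';

$str = "<table><tr>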
<td>fluff</td>  
<td>irrelevant</td> 
<td>etc</td>   
<td><a href=\"one\">Hello world</a></td>                        
<td>123.456</td> 
<td>fluff</td>          
<td>irrelevant</td>   
<td>etc</td>
</tr></table>";

$html = str_get_html($str);

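// collect every <td> in the first table
$tds = $html->find('table', 0)->find('td');
$num = null;
foreach ($tds as $td) {
    // trim() guards against stray whitespace around the cell text
    if (trim($td->plaintext) == 'Hello world') {
        // next_sibling() returns the element that follows, here the next <td>
        $next_td = $td->next_sibling();
        $num = $next_td->plaintext;
        break;
    }
}

echo($num);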

Tags: dom, php, html-parsing

stotrami
