This is my code:
<?php
/**
* @author Joomlacoders
* @copyright 2010
*/
$url="http://urlchecker.net/html/demo.html";
$innerHtml=file_get_contents($url);
//echo $innerHtml;
preg_match_all("{\<div id='news-id-.*d'\>(.*)\</div\>}",$innerHtml,$matches);
//<div id='news-id-160346'>
var_dump($matches);
?>
I want to find all the content inside the div with id='news-id-160346'. Please help me.
Use an HTML parser. NOT regular expressions.
The problem with regular expressions is that they cannot match nested structures. Assuming your regex must match a single <div> and its closing tag, there is no way to correctly match this input:
<div id="a">
<div id="b">
Foo
</div>
</div>
<div id="c">
Bar
</div>
Because if your regular expression is greedy, it will run on to the last </div> and swallow both top-level divs, and if it's ungreedy, it will stop at the first </div>, which closes the inner div rather than the one you started matching.
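To make the failure concrete, here is a minimal sketch that runs a greedy and an ungreedy pattern against the sample markup above. The patterns and variable names are illustrative assumptions, not taken from the question:

```php
<?php
// Sample markup from above: a nested <div> followed by a sibling <div>.
$html = <<<HTML
<div id="a">
<div id="b">
Foo
</div>
</div>
<div id="c">
Bar
</div>
HTML;

// Greedy: .* runs to the LAST </div>, so the capture swallows div "c" too.
preg_match('{<div id="a">(.*)</div>}s', $html, $greedy);

// Ungreedy: .*? stops at the FIRST </div>, which closes div "b", not "a".
preg_match('{<div id="a">(.*?)</div>}s', $html, $lazy);

// Too much: the greedy capture contains the unrelated div "c".
$greedyHasC = strpos($greedy[1], 'id="c"') !== false;

// Too little: the ungreedy capture ends before div "b" is even closed.
$lazyHasInnerClose = strpos($lazy[1], '</div>') !== false;

var_dump($greedyHasC, $lazyHasInnerClose);
```

Neither capture is the actual content of div "a", which is why the rest of this answer uses a parser.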
Therefore, you should use an HTML parser. With PHP, DOMDocument::loadHTML or DOMDocument::loadHTMLFile each does a fairly good job. (You may "safely" ignore the warnings they generate: they are only markup errors, and the resulting DOMDocument object should be pretty much okay.)
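If the warnings get noisy, one option is to have libxml collect them instead of printing them. The following is a rough sketch on a deliberately sloppy, made-up snippet of markup (the string in $html is an assumption for illustration, not the real page):

```php
<?php
// Deliberately sloppy markup: an unclosed <p> and a made-up <foo> tag.
$html = "<div id='news-id-1'><p>Hello<foo>world</div>";

// Collect libxml messages internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$loaded = $doc->loadHTML($html);

// Inspect the messages if you care about them, then reset libxml's state.
$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

// The document is still usable despite the markup problems.
$divs = $doc->getElementsByTagName('div');
echo $divs->length, " div(s), ", count($errors), " parser message(s)\n";
```

How many messages libxml reports depends on the libxml version, so treat the count as informational only.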
Since PHP's getElementById is a pain to get working, you can use DOMXPath for the same purpose:
<?php
$url = "http://urlchecker.net/html/demo.html";
$d = new DOMDocument();
$d->loadHTMLFile($url);
$xpath = new DOMXPath($d);
// Select the element whose id attribute equals the one we want.
$myNews = $xpath->query('//*[@id="news-id-160346"]')->item(0);
?>
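Since the question asks for the content of the div, and the original regex was aiming at every news-id-… div, here is a hedged sketch of how that might look. The inline $html stands in for the remote page (its real markup may differ), and innerHtml() is a hypothetical helper, not a built-in:

```php
<?php
// Inline stand-in for the remote page; the real markup may differ.
$html = "<html><body>"
      . "<div id='news-id-160346'><p>First <b>story</b></p></div>"
      . "<div id='news-id-160347'><p>Second story</p></div>"
      . "<div id='sidebar'>Ignore me</div>"
      . "</body></html>";

libxml_use_internal_errors(true);   // keep markup warnings quiet
$d = new DOMDocument();
$d->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($d);

// Hypothetical helper: serialize a node's children, i.e. its "inner HTML".
function innerHtml(DOMNode $node): string
{
    $out = '';
    foreach ($node->childNodes as $child) {
        $out .= $node->ownerDocument->saveHTML($child);
    }
    return $out;
}

// starts-with() picks up every div whose id begins with "news-id-".
$news = [];
foreach ($xpath->query('//div[starts-with(@id, "news-id-")]') as $div) {
    $news[$div->getAttribute('id')] = innerHtml($div);
}

print_r($news);
```

For a single known id, the same helper can be applied to $myNews from the snippet above. If you only need the text without tags, $myNews->textContent may be enough.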