The easiest way to access a single element in the DOM is by its unique ID. You can get an element by ID with the getElementById() method of the document object. In the Console, get the element and assign it to the demoId variable. Logging demoId to the console will return our entire HTML element.
getElementsByClassName() The getElementsByClassName method of Document interface returns an array-like object of all child elements which have all of the given class name(s).
Update: Xpath version of *[@class~='my-class']
css selector
So after my comment below in response to hakre's comment, I got curious and looked into the code behind Zend_Dom_Query
. It looks like the above selector is compiled to the following xpath (untested):
[contains(concat(' ', normalize-space(@class), ' '), ' my-class ')]
So the PHP would be:
$dom = new DomDocument();
$dom->load($filePath);
$finder = new DomXPath($dom);
$classname="my-class";
$nodes = $finder->query("//*[contains(concat(' ', normalize-space(@class), ' '), ' $classname ')]");
Basically, all we do here is normalize the class
attribute so that even a single class is bounded by spaces, and the complete class list is bounded in spaces. Then append the class we are searching for with a space. This way we are effectively looking for and find only instances of my-class
.
Use an xpath selector?
$dom = new DomDocument();
$dom->load($filePath);
$finder = new DomXPath($dom);
$classname="my-class";
$nodes = $finder->query("//*[contains(@class, '$classname')]");
If it is only ever one type of element you can replace the *
with the particular tagname.
If you need to do a lot of this with very complex selector I would recommend Zend_Dom_Query
which supports CSS selector syntax (a la jQuery):
$finder = new Zend_Dom_Query($html);
$classname = 'my-class';
$nodes = $finder->query("*[class~=\"$classname\"]");
If you wish to get the innerhtml of the class without the zend you could use this:
$dom = new DomDocument();
$dom->load($filePath);
$classname = 'main-article';
$finder = new DomXPath($dom);
$nodes = $finder->query("//*[contains(concat(' ', normalize-space(@class), ' '), ' $classname ')]");
$tmp_dom = new DOMDocument();
foreach ($nodes as $node)
{
$tmp_dom->appendChild($tmp_dom->importNode($node,true));
}
$innerHTML.=trim($tmp_dom->saveHTML());
echo $innerHTML;
I think the accepted way is better, but I guess this might work as well
function getElementByClass(&$parentNode, $tagName, $className, $offset = 0) {
$response = false;
$childNodeList = $parentNode->getElementsByTagName($tagName);
$tagCount = 0;
for ($i = 0; $i < $childNodeList->length; $i++) {
$temp = $childNodeList->item($i);
if (stripos($temp->getAttribute('class'), $className) !== false) {
if ($tagCount == $offset) {
$response = $temp;
break;
}
$tagCount++;
}
}
return $response;
}
There is also another approach without the use of DomXPath
or Zend_Dom_Query
.
Based on dav's original function, I wrote the following function that returns all the children of the parent node whose tag and class match the parameters.
function getElementsByClass(&$parentNode, $tagName, $className) {
$nodes=array();
$childNodeList = $parentNode->getElementsByTagName($tagName);
for ($i = 0; $i < $childNodeList->length; $i++) {
$temp = $childNodeList->item($i);
if (stripos($temp->getAttribute('class'), $className) !== false) {
$nodes[]=$temp;
}
}
return $nodes;
}
suppose you have a variable $html
the following HTML:
<html>
<body>
<div id="content_node">
<p class="a">I am in the content node.</p>
<p class="a">I am in the content node.</p>
<p class="a">I am in the content node.</p>
</div>
<div id="footer_node">
<p class="a">I am in the footer node.</p>
</div>
</body>
</html>
use of getElementsByClass
is as simple as:
$dom = new DOMDocument('1.0', 'utf-8');
$dom->loadHTML($html);
$content_node=$dom->getElementById("content_node");
$div_a_class_nodes=getElementsByClass($content_node, 'div', 'a');//will contain the three nodes under "content_node".
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With