I want to select elements by class alone, using a class called date.
For some reason, I cannot get this to work. If anyone knows what is wrong with my code, help would be much appreciated.
@$doc = new DOMDocument();
@$doc->loadHTML($html);
$xml = simplexml_import_dom($doc); // just to make xpath more simple
$images = $xml->xpath('//[@class="date"]');
foreach ($images as $img)
{
echo $img." ";
}
In XPath, a class is not a special construct: it is the value of the class attribute, which is usually shared by multiple elements in the document. XPath expressions select a node or a list of nodes by testing that attribute. The attribute holds a list of class names separated by whitespace, so a single value may contain several tokens.
Browser plug-ins can generate an XPath or CSS selector, but the results are rarely useful in real applications. Simple id or name locators can be used directly; with XPath or CSS, two locators can be combined whenever required.
Advantages and disadvantages of CSS selectors
Performance is the same as or faster than XPath, and CSS selectors are easier to learn and use. However, a CSS selector only allows unidirectional flow: it can traverse from parent to child but not from child to parent, which is possible with XPath.
Advantages of using CSS selectors
They are generally at least as fast as XPath, much easier to learn and implement, usually sufficient to locate the elements you need, and compatible with most browsers to date.
I want to write the canonical answer to this question because the answer above has a problem.
The CSS selector:
.foo
will select any element that has the class foo.
How do you do this in XPath?
Although XPath is more powerful than CSS, XPath doesn't have a native equivalent of a CSS class selector. However, there is a solution.
The equivalent selector in XPath is:
//*[contains(concat(" ", normalize-space(@class), " "), " foo ")]
The function normalize-space strips leading and trailing whitespace (and also replaces sequences of whitespace characters by a single space).
(In a more general sense) this is also the equivalent of the CSS selector:
*[class~="foo"]
which will match any element whose class attribute value is a list of whitespace-separated values, one of which is exactly equal to foo.
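As a rough illustration, the expression above could be applied from PHP with DOMXPath along these lines. This is only a sketch: the helper name classSelector and the sample markup are made up for the example, and the helper assumes the class name contains no double quotes.

```php
<?php
// Hypothetical helper: builds the XPath 1.0 class-token test for one class name.
function classSelector(string $class): string
{
    // Assumption: $class contains no double quotes.
    return sprintf('//*[contains(concat(" ", normalize-space(@class), " "), " %s ")]', $class);
}

// Made-up sample markup; note that "update" contains "date" as a substring only.
$html = '<div><span class="date">2016-12-13</span>'
      . '<span class="meta date ">2016-12-14</span>'
      . '<span class="update">n/a</span></div>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$matches = [];
foreach ($xpath->query(classSelector('date')) as $node) {
    $matches[] = $node->textContent;
}
echo implode(' ', $matches), "\n";
```

The two spans carrying the date token are selected, while the span with class update is skipped because date is not a separate token there.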
The XPath selector:
//*[@class="foo"]
doesn't work, because it won't match an element that has more than one class, for example:
<div class="foo bar">
It also won't match if there is any extra whitespace around the class name:
<div class=" foo ">
The 'improved' XPath selector
//*[contains(@class, "foo")]
doesn't work either, because it wrongly matches elements with the class foobar, for example:
<div class="foobar">
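To see the difference between these expressions side by side, something like the following PHP sketch could be used. The markup and the labels exact, contains and token are invented for the comparison; only the three XPath expressions come from the answer.

```php
<?php
// Made-up sample markup covering the cases discussed above.
$html = '<div>'
      . '<p class="foo">exact</p>'
      . '<p class="foo bar">multi</p>'
      . '<p class=" foo ">padded</p>'
      . '<p class="foobar">substring</p>'
      . '</div>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

// Labels are hypothetical; the expressions are the ones from the answer.
$expressions = [
    'exact'    => '//*[@class="foo"]',
    'contains' => '//*[contains(@class, "foo")]',
    'token'    => '//*[contains(concat(" ", normalize-space(@class), " "), " foo ")]',
];

$results = [];
foreach ($expressions as $label => $expression) {
    $results[$label] = [];
    foreach ($xpath->query($expression) as $node) {
        $results[$label][] = $node->textContent;
    }
    printf("%-8s => %s\n", $label, implode(', ', $results[$label]));
}
```

The exact comparison misses the element with class "foo bar", the plain contains() test also picks up foobar, and the token test selects the three elements that really carry the class foo.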
Credit goes to this fella, whose post was the earliest published solution to this problem that I found on the web: http://dubinko.info/blog/2007/10/01/simple-parsing-of-space-seprated-attributes-in-xpathxslt/
//[@class="date"]
is not a valid XPath expression, because the step after // has no node test.
Try //*[@class="date"], or, if you know the element is an image, //img[@class="date"].
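Applied to the code from the question, the fix might look roughly like this. The $html value is a placeholder invented for the example, and libxml_use_internal_errors is swapped in for the @ suppression as one possible way to keep parser warnings quiet.

```php
<?php
// Placeholder input; in the question, $html comes from elsewhere.
$html = '<ul><li class="date">2016-12-13</li>'
      . '<li class="title">Release</li>'
      . '<li class="date">2017-03-21</li></ul>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // used here instead of the @ operator
$doc->loadHTML($html);
libxml_clear_errors();

$xml = simplexml_import_dom($doc); // just to make xpath more simple
$nodes = $xml->xpath('//*[@class="date"]'); // the * supplies the missing node test

$texts = [];
foreach ($nodes as $node) {
    $texts[] = trim((string) $node);
}
echo implode(' ', $texts), "\n";
```

This only matches elements whose class attribute is exactly date; an element with several classes needs a token-based test instead.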
XPath 3.1 introduces a function contains-token and thus finally solves this ‘officially’. It is designed to support classes.
Example:
//*[contains-token(@class, "foo")]
This function makes sure that white space (not only the space character, U+0020) is handled correctly, works when a class name is repeated, and generally covers the edge cases.
Note: As of today (2016-12-13) XPath 3.1 has the status of Candidate Recommendation.
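PHP's DOM extension is built on libxml2, which implements XPath 1.0, so contains-token is not available there. One possible workaround, sketched below, is to register a PHP callback with DOMXPath::registerPhpFunctions and call it from the expression. The callback name has_class_token and the sample markup are hypothetical, and the callback only approximates what contains-token does.

```php
<?php
// Hypothetical callback: true if $token is one of the whitespace-separated tokens in $value.
function has_class_token(string $value, string $token): bool
{
    $tokens = preg_split('/\s+/', $value, -1, PREG_SPLIT_NO_EMPTY);
    return in_array($token, $tokens, true);
}

// Made-up markup: a tab separator, a repeated class, a substring match, and no class at all.
$html = '<div><i class="foo">a</i>'
      . "<i class=\"bar\tfoo foo\">b</i>"
      . '<i class="foobar">c</i><i>d</i></div>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$xpath->registerNamespace('php', 'http://php.net/xpath');
$xpath->registerPhpFunctions('has_class_token'); // expose only this callback

$found = [];
$query = '//*[php:functionString("has_class_token", @class, "foo")]';
foreach ($xpath->query($query) as $node) {
    $found[] = $node->textContent;
}
echo implode(' ', $found), "\n";
```

Restricting registerPhpFunctions to a single named function keeps the expression from reaching arbitrary PHP functions, which matters if any part of the XPath string comes from user input.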
In XPath 2.0 you can use:
//*[count(index-of(tokenize(@class, '\s+' ), 'foo')) = 1]
as stated by Christian Weiske in: https://cweiske.de/tagebuch/XPath%3A%20Select%20element%20by%20class.htm