Selecting a css class with xpath

Tags: html, php, xml, web, xpath

I want to select elements by a single class on its own, .date

For some reason, I cannot get this to work. If anyone knows what is wrong with my code, it would be much appreciated.

@$doc = new DOMDocument();
@$doc->loadHTML($html);
$xml = simplexml_import_dom($doc); // just to make xpath more simple
$images = $xml->xpath('//[@class="date"]');                             
foreach ($images as $img)
{
    echo  $img." ";
}
Teddy13 asked Jan 10 '12 19:01



4 Answers

I want to write the canonical answer to this question because the answer above has a problem.

Our problem

The CSS selector:

.foo 

will select any element that has the class foo.

How do you do this in XPath?

Although XPath is more powerful than CSS, XPath doesn't have a native equivalent of a CSS class selector. However, there is a solution.

The right way to do it

The equivalent selector in XPath is:

//*[contains(concat(" ", normalize-space(@class), " "), " foo ")] 

The function normalize-space strips leading and trailing whitespace (and also replaces sequences of whitespace characters by a single space).

(In a more general sense) this is also the equivalent of the CSS selector:

*[class~="foo"] 

which will match any element whose class attribute value is a list of whitespace-separated values, one of which is exactly equal to foo.

A couple of obvious, but wrong ways to do it

The XPath selector:

//*[@class="foo"] 

doesn't work, because it won't match an element that has more than one class, for example:

<div class="foo bar"> 

It also won't match if there is any extra whitespace around the class name:

<div class="  foo "> 

The 'improved' XPath selector

//*[contains(@class, "foo")] 

doesn't work either, because it wrongly matches elements with the class foobar, for example:

<div class="foobar"> 
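To make the difference concrete, here is a minimal PHP sketch that runs all three expressions against the example elements above. The sample markup, the array keys, and the variable names are assumptions made for illustration; only the XPath expressions come from this answer.

```php
<?php
// Sample markup (an assumption): one element per case discussed above.
$doc = new DOMDocument();
$doc->loadHTML(
    '<div class="foo bar">a</div>' .
    '<div class="  foo ">b</div>' .
    '<div class="foobar">c</div>'
);
$xpath = new DOMXPath($doc);

$queries = [
    'exact'    => '//*[@class="foo"]',
    'contains' => '//*[contains(@class, "foo")]',
    'token'    => '//*[contains(concat(" ", normalize-space(@class), " "), " foo ")]',
];

$found = [];
foreach ($queries as $name => $query) {
    $texts = [];
    foreach ($xpath->query($query) as $node) {
        $texts[] = $node->textContent;
    }
    // Collect the text of every matched element, in document order.
    $found[$name] = implode(',', $texts);
    echo $name, ': ', $found[$name], "\n";
}
```

With this input, the substring query should also pick up the foobar element (c), while the token query should return only a and b.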

Credit goes to this fella, who published the earliest solution to this problem that I found on the web: http://dubinko.info/blog/2007/10/01/simple-parsing-of-space-seprated-attributes-in-xpathxslt/

user716736 answered Sep 28 '22 18:09


//[@class="date"] is not a valid XPath expression: the step after // has no node test (such as * or an element name).

Try //*[@class="date"], or if you know it is an image, //img[@class="date"]
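Applied to the code in the question, the fix could look like the sketch below. The sample HTML and the $dates variable are assumptions for illustration. Note that this exact-match form only finds elements whose class attribute is exactly "date".

```php
<?php
// Sample input (an assumption); the question does not show its HTML.
$html = '<ul>'
      . '<li class="date">2012-01-10</li>'
      . '<li class="date">2012-01-11</li>'
      . '<li class="title">Hello</li>'
      . '</ul>';

$doc = new DOMDocument();
// Collect parser warnings internally instead of silencing them with @.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xml = simplexml_import_dom($doc);

$dates = [];
// The * node test is what was missing from the original expression.
foreach ($xml->xpath('//*[@class="date"]') as $node) {
    $dates[] = (string) $node;
}
echo implode(' ', $dates), "\n";
```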

MrGlass answered Sep 28 '22 17:09


XPath 3.1 introduces a function contains-token and thus finally solves this ‘officially’. It is designed to support classes.

Example:

//*[contains-token(@class, "foo")]

This function makes sure that white space (not only the space character, U+0020) is handled correctly, works when a class name is repeated, and generally covers the edge cases.
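PHP's DOMXPath is built on libxml2, which implements XPath 1.0, so contains-token cannot be called there directly. As a rough approximation only, similar token matching can be sketched with a PHP callback registered through DOMXPath::registerPhpFunctions. The helper name class_has_token and the sample markup are hypothetical, and the whitespace handling follows PCRE's \s rather than the XPath 3.1 definition.

```php
<?php
// Hypothetical helper: true if $value, split on whitespace, contains $token.
function class_has_token(string $value, string $token): bool
{
    $tokens = preg_split('/\s+/', $value, -1, PREG_SPLIT_NO_EMPTY);
    return in_array($token, $tokens, true);
}

// Sample markup (an assumption), including a repeated class name.
$doc = new DOMDocument();
$doc->loadHTML(
    '<div class="foo bar">a</div>' .
    '<div class="foobar">b</div>' .
    '<div class="foo foo">c</div>'
);

$xpath = new DOMXPath($doc);
// Expose the PHP function to XPath under the php: prefix.
$xpath->registerNamespace('php', 'http://php.net/xpath');
$xpath->registerPhpFunctions('class_has_token');

$hits = [];
$query = '//*[php:function("class_has_token", string(@class), "foo")]';
foreach ($xpath->query($query) as $node) {
    $hits[] = $node->textContent;
}
echo implode(',', $hits), "\n"; // prints "a,c"
```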


Note: As of today (2016-12-13), XPath 3.1 has the status of Candidate Recommendation.

Robin Pokorny answered Sep 28 '22 18:09


In XPath 2.0 you can use:

//*[count(index-of(tokenize(@class, '\s+' ), 'foo')) = 1]

as stated by Christian Weiske in: https://cweiske.de/tagebuch/XPath%3A%20Select%20element%20by%20class.htm

Memke answered Sep 28 '22 17:09