How to find all elements containing the word "download" using Selenium x-path?

I'm using Selenium to do some web scraping, and I want to find all elements that the user can click on and that contain the word "download" (in any capitalization) in the link text, the button text, the element id, the element class, or the href. This can include links, buttons, or any other element.

In this answer I found an XPath expression, written for somebody who wanted to search for buttons based on a certain text (or on case-insensitive and partial matches):

text = 'download'
driver.find_elements_by_xpath("//*[contains(text(), 'download')]")

but on this page that returns no results, even though the following link is in there:

<a id="downloadTop" class="navlink" href="javascript:__doPostBack('downloadTop','')">Download</a>

Does anybody know how I can find all elements which somehow contain the word "download" in a website?

[EDIT] This question was marked as a duplicate of another question, whose answer suggests changing the expression to "//*[text()[contains(.,'download')]]". So I tried the following:

>>> from selenium import webdriver
>>> d = webdriver.Firefox()
>>> link = 'https://www.yourticketprovider.nl/LiveContent/tickets.aspx?x=492449&y=8687&px=92AD8EAA22C9223FBCA3102EE0AE2899510C03E398A8A08A222AFDACEBFF8BA95D656F01FB04A1437669EC46E93AB5776A33951830BBA97DD94DB1729BF42D76&rand=a17cafc7-26fe-42d9-a61a-894b43a28046&utm_source=PurchaseSuccess&utm_medium=Email&utm_campaign=SystemMails'
>>> d.get(link)
>>> d.find_elements_by_xpath("//*[text()[contains(.,'download')]]")
[]  # As you can see it still doesn't get any results..
>>>

Does anybody know how I can get all elements on which the user can click and which contain the word "download" in either the link text, the button text, the element id, the element class or the href? All tips are welcome!

asked Nov 20 '15 17:11

kramer65

3 Answers

Try this:

//*[(@id|@class|@href|text())
       [contains(translate(.,'DOWNLOAD','download'), 'download')]]

This XPath 1.0 expression selects all elements that have an id, class, or href attribute, or a text-node child, whose string value contains the string "download" in any capitalization.
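The same pattern can be generated for other search words. The following is a minimal Python sketch and not part of the original answer: the function name build_ci_xpath and its parameters are made up for illustration, and it assumes the word consists of plain ASCII letters with no quote characters. It maps the whole alphabet in translate() so that the expression does not depend on which letters the word contains.

```python
import string


def build_ci_xpath(word, targets=("@id", "@class", "@href", "text()")):
    """Build an XPath 1.0 expression matching elements whose listed
    attributes or text-node children contain `word`, ignoring ASCII case.

    Hypothetical helper for illustration; quotes in `word` are not handled.
    """
    if "'" in word or '"' in word:
        raise ValueError("quote characters in the word are not handled here")
    needle = word.lower()
    # Mapping the full alphabet works for any ASCII word, not just "download".
    upper, lower = string.ascii_uppercase, string.ascii_lowercase
    union = "|".join(targets)
    return "//*[(%s)[contains(translate(.,'%s','%s'), '%s')]]" % (
        union, upper, lower, needle)


print(build_ci_xpath("Download"))
```

The returned string can be passed to Selenium in the same way as the hand-written expression.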

Here is a running proof. The XSLT transformation below is used to evaluate the XPath expression and to copy all selected nodes to the output:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <xsl:template match="/">
    <xsl:copy-of select=
    "//*[(@id|@class|@href|text())
       [contains(translate(.,'DOWNLOAD','download'), 'download')]]
    "/>
  </xsl:template>
</xsl:stylesheet>

When we apply the transformation to the following test-document:

<html>
  <a id="downloadTop" class="navlink" 
    href="javascript:__doPostBack('downloadTop','')">Download</a>
  <b id="y" class="x_downLoad"/>
  <p>Nothing to do_wnLoad</p>
  <a class="m" href="www.DownLoad.com">Get it!</a>
  <b>dOwnlOad</b>
</html>

The wanted elements are selected and then copied to the output:

<a id="downloadTop" class="navlink" href="javascript:__doPostBack('downloadTop','')">Download</a>
<b id="y" class="x_downLoad"/>
<a class="m" href="www.DownLoad.com">Get it!</a>
<b>dOwnlOad</b>
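
To see why these four elements match and the <p> element does not, the effect of translate() can be reproduced outside XPath. The sketch below is an illustrative Python emulation (the name xpath_translate is hypothetical, not a library function). It follows the XPath 1.0 rule that each character found in the second argument is replaced by the character at the same position in the third argument, and is removed if there is no such position.

```python
def xpath_translate(value, src, dst):
    """Emulate XPath 1.0 translate(value, src, dst) for illustration."""
    table = {}
    for i, ch in enumerate(src):
        if ord(ch) in table:
            continue  # the first occurrence of a character in src wins
        # Characters without a counterpart in dst are removed.
        table[ord(ch)] = dst[i] if i < len(dst) else None
    return value.translate(table)


# Attribute and text values taken from the test document above.
samples = ["Download", "x_downLoad", "Nothing to do_wnLoad",
           "www.DownLoad.com", "dOwnlOad"]
for s in samples:
    lowered = xpath_translate(s, "DOWNLOAD", "download")
    print(repr(lowered), "download" in lowered)
```

Only the value with the underscore inside the word fails the contains() check, which is why the <p> element is left out.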

answered Oct 02 '22 03:10

Dimitre Novatchev

Since you need a case-insensitive match and XPath 1.0 does not support one directly, you'll have to use the translate() function. Since you also need a partial match, you need contains(). And since you want to check the id, class and href attributes as well as the text:

from selenium import webdriver

driver = webdriver.Firefox()
driver.get("https://www.yourticketprovider.nl/LiveContent/tickets.aspx?x=492449&y=8687&px=92AD8EAA22C9223FBCA3102EE0AE2899510C03E398A8A08A222AFDACEBFF8BA95D656F01FB04A1437669EC46E93AB5776A33951830BBA97DD94DB1729BF42D76&rand=a17cafc7-26fe-42d9-a61a-894b43a28046&utm_source=PurchaseSuccess&utm_medium=Email&utm_campaign=SystemMails")

condition = "contains(translate(%s, 'DOWNLOAD', 'download'), 'download')"
things_to_check = ["text()", "@class", "@id", "@href"]
conditions = " or ".join(condition % thing for thing in things_to_check)

for elm in driver.find_elements_by_xpath("//*[%s]" % conditions):
    print(elm.text)

Here we are basically constructing the expression via string formatting and concatenation, making case-insensitive checks on text() and on the class, id and href attributes, and joining the conditions with or.
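
The expression above does not check whether the matched elements can actually be clicked. One possible follow-up step is sketched here under some assumptions: it treats "visible and enabled" as a rough stand-in for "clickable", relies on the WebElement methods is_displayed() and is_enabled(), and passes the locator strategy as the string "xpath" (the value of Selenium's By.XPATH constant), because newer Selenium 4 releases no longer provide the find_elements_by_xpath shortcut. The function name find_clickable is hypothetical.

```python
def find_clickable(driver, xpath):
    """Return the elements matching `xpath` that are visible and enabled.

    `driver` is expected to be a Selenium WebDriver. The string "xpath"
    is the value of selenium.webdriver.common.by.By.XPATH.
    Visible-and-enabled is only an approximation of "clickable".
    """
    elements = driver.find_elements("xpath", xpath)
    return [el for el in elements if el.is_displayed() and el.is_enabled()]


# Usage with a live browser session, reusing `conditions` from above:
#     for elm in find_clickable(driver, "//*[%s]" % conditions):
#         print(elm.text)
```

Filtering in Python keeps the XPath itself unchanged. Whether an element really reacts to a click still depends on the page's scripts, so the result is best treated as a list of candidates.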

answered Oct 02 '22 02:10

alecxe

Well, the answer you found already tells you how to do what you want. The problem I see is that text = 'download' starts with lower case while the text in <a id="downloadTop" class="navlink" href="javascript:__doPostBack('downloadTop','')">Download</a> starts with upper case.

Start by changing your text to text = 'Download' and see if it finds your element now. If that was the problem then you can use a little trick like

text = 'ownload'

driver.find_elements_by_xpath("(//*[contains(text(), '" + text + "')] | //*[@value='" + text + "'])")

to ignore the first character.

EDIT: Yes, you can make it case-insensitive.

driver.find_elements_by_xpath("(//*[contains(translate(text(), 'DOWNLOAD', 'download'), 'download')])")

answered Oct 02 '22 01:10

Pablo Miranda

Tags: python, javascript, html, selenium, xpath