
How can I perform an action based on the contents of a div with Selenium Webdriver?

I have a Ruby application using Selenium Webdriver and Nokogiri. I want to pick a class and then, for each div with that class, perform an action based on the contents of the div.

For example, I'm parsing the following page:

https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=puppies

It's a page of search results, and I'm looking for the first result with the word "Adoption" in the description. So the bot should look for divs with className: "result", for each one check if its .description div contains the word "adoption", and if it does, click on the .link div. In other words, if the .description does not include that word, then the bot moves on to the next .result.

This is what I have so far, which just clicks on the first result:

require "selenium-webdriver"
require "nokogiri"
driver = Selenium::WebDriver.for :chrome
driver.navigate.to "https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=puppies"
driver.find_element(:class, "link").click
Joe Morano asked Mar 03 '16 05:03


2 Answers

You can get the list of elements that contain "adopt" or "Adopt" with an XPath contains() test, using the union operator (|) to combine the results for the two spellings. See the code below:

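require "selenium-webdriver"

driver = Selenium::WebDriver.for :chrome
driver.navigate.to "https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=puppies"
sleep 5  # crude pause so the results can render

# links of results whose text contains "Adopt" or "adopt"
items = driver.find_elements(:xpath, "//div[@class='g']/div[contains(.,'Adopt')]/h3/a|//div[@class='g']/div[contains(.,'adopt')]/h3/a")

# print every match first: a click loads a new page and invalidates the rest
items.each { |element| puts element.text }

# then click only the first match
items.first.click unless items.empty?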
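
The two-branch union can also be collapsed into a single case-insensitive test: XPath 1.0 has no lower-case() function, but translate() can map upper-case letters to lower-case before contains() runs. A minimal sketch, assuming the same div[@class='g'] markup as the expression above; ci_contains and result_links_xpath are hypothetical helper names that only build the expression string:

```ruby
# Hypothetical helper: build an XPath predicate that checks whether a node's
# text contains `word`, ignoring ASCII case.
def ci_contains(word)
  # A single quote would break the quoted XPath literal, so reject it.
  raise ArgumentError, "word must not contain a single quote" if word.include?("'")

  upper = ("A".."Z").to_a.join
  lower = ("a".."z").to_a.join
  "contains(translate(., '#{upper}', '#{lower}'), '#{word.downcase}')"
end

# Hypothetical helper: same element path as the union expression above
# (div.g > div > h3 > a is an assumption carried over from it).
def result_links_xpath(word)
  "//div[@class='g']/div[#{ci_contains(word)}]/h3/a"
end

puts result_links_xpath("Adopt")
```

The resulting string can be passed to driver.find_elements(:xpath, ...) in place of the union expression, and it also matches spellings such as "ADOPT".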
Buaban answered Oct 31 '22 16:10


The pattern for handling each iteration depends on the type of action performed on each item. If the action is a click, you can't collect all the links first and then click each of them, since the first click loads a new page and makes the element list stale. So if you wish to click on each link, one way is to use an XPath that contains the position of the link for each iteration:

# iteration 1
driver.find_element(:xpath, "(//h3[@class='r']/a)[1]").click   # click first link

# iteration 2
driver.find_element(:xpath, "(//h3[@class='r']/a)[2]").click   # click second link
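
When the action is only to visit each result, another way around the stale list is to read every link's href before leaving the page, since plain strings stay valid after navigation. A minimal sketch under that assumption; collect_hrefs and visit_each are hypothetical helper names, and the driver only needs to respond to find_elements, navigate.to and title the way a Selenium::WebDriver driver does:

```ruby
# Hypothetical helper: read the href of every element matching `xpath`
# while the result page is still loaded. Links without an href are dropped.
def collect_hrefs(driver, xpath)
  driver.find_elements(:xpath, xpath)
        .map { |link| link.attribute("href") }
        .compact
end

# Hypothetical helper: open each URL directly instead of clicking, so no
# element reference has to survive a page load. Returns the page titles
# and yields (position, title) to an optional block.
def visit_each(driver, hrefs)
  hrefs.each_with_index.map do |href, index|
    driver.navigate.to href
    title = driver.title
    yield index + 1, title if block_given?
    title
  end
end
```

With a real driver this would be called as hrefs = collect_hrefs(driver, "//h3[@class='r']/a") followed by visit_each(driver, hrefs) { |n, title| puts "Page #{n}: #{title}" }. The trade-off is that any JavaScript click handler on the link is bypassed.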

Here is an example that clicks on each link from a result page:

require 'selenium-webdriver'

driver = Selenium::WebDriver.for :chrome
wait = Selenium::WebDriver::Wait.new(timeout: 10)  # timeout is given in seconds

driver.navigate.to "https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=puppies"

# define the xpath
search_word = "Puppies"
xpath = ("(//h3[@class='r']/a[contains(.,'%s')]" % search_word) + ")[%s]"

# iterate each result by inserting the position in the XPath
i = 0
while true do

  # wait for the results to be loaded
  wait.until {driver.find_elements(:xpath, "(//h3[@class='r']/a)[1]").any?}

  # get the next link
  link = driver.find_elements(:xpath, xpath % [i+=1]).first
  break if !link

  # click the link
  link.click

  # wait for a new page
  wait.until {driver.find_elements(:xpath, "(//h3[@class='r']/a)[1]").empty?}

  # handle the new page
  puts "Page #{i}: " + driver.title

  # return to the main page
  driver.navigate.back
end

puts "The end!"
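
For the goal stated in the question (click only the first result whose description mentions a word), a single pass over the result containers is enough, because the loop stops at the first click and never touches a stale element. A minimal sketch, assuming the class names result, description and link from the question (the real Google markup differs); click_first_match is a hypothetical helper name:

```ruby
# Hypothetical helper: click the link of the first result whose description
# contains `word` (case-insensitive). Returns the description text of the
# clicked result, or nil when no result matched.
def click_first_match(driver, word)
  driver.find_elements(:class, "result").each do |result|
    description = result.find_element(:class, "description").text
    # Move on to the next result when the word is absent.
    next unless description.downcase.include?(word.downcase)

    result.find_element(:class, "link").click
    return description
  end
  nil
end
```

With a real driver this would be called as click_first_match(driver, "adoption") once the results have loaded. Note that find_element raises Selenium::WebDriver::Error::NoSuchElementError when a result has no description child; using find_elements(...).first and skipping nil would tolerate such results instead.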
Florent B. answered Oct 31 '22 15:10