I have a Ruby application using Selenium Webdriver and Nokogiri. I want to choose a class, and then for each div corresponding to that class, I want to perform an action based on the contents of the div.
For example, I'm parsing the following page:
https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=puppies
It's a page of search results, and I'm looking for the first result with the word "Adoption" in the description. So the bot should look for divs with the class "result", check whether each one's .description div contains the word "adoption", and if it does, click on its .link div. In other words, if the .description does not include that word, the bot moves on to the next .result.
This is what I have so far, which just clicks on the first result:
require "selenium-webdriver"
require "nokogiri"
driver = Selenium::WebDriver.for :chrome
driver.navigate.to "https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=puppies"
driver.find_element(:class, "link").click
You can get the list of elements that contain "adopt" or "Adopt" with an XPath using contains(), then use the union operator (|) to combine the results for "adopt" and "Adopt". See the code below:
driver = Selenium::WebDriver.for :chrome
driver.navigate.to "https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=puppies"
sleep 5
items = driver.find_elements(:xpath,"//div[@class='g']/div[contains(.,'Adopt')]/h3/a|//div[@class='g']/div[contains(.,'adopt')]/h3/a")
for element in items
  link_text = element.text
  print link_text
  # note: the click loads a new page, so the remaining elements become stale
  element.click
end
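As a variation on the union approach above, XPath 1.0's translate() can lower-case the text before matching, so a single expression covers any capitalization. This is only a minimal sketch, assuming the same //div[@class='g'] markup as in the code above. The helper names (contains_ignore_case, result_links_xpath, click_first_match) are hypothetical, and the search word must not contain a single quote:

```ruby
# Hypothetical helper: XPath 1.0 predicate that matches text case-insensitively
# by lower-casing the node text with translate() before calling contains().
def contains_ignore_case(word)
  upper = ("A".."Z").to_a.join
  lower = upper.downcase
  "contains(translate(., '#{upper}', '#{lower}'), '#{word.downcase}')"
end

# Hypothetical helper: same result-link XPath as above, without the union.
def result_links_xpath(word)
  "//div[@class='g']/div[#{contains_ignore_case(word)}]/h3/a"
end

# Clicks only the first matching link and returns it (nil if nothing matched).
# `driver` is assumed to be a Selenium::WebDriver instance.
def click_first_match(driver, word)
  link = driver.find_elements(:xpath, result_links_xpath(word)).first
  link&.click
  link
end

puts result_links_xpath("Adopt")
```

With a real browser session this would be called as click_first_match(driver, "adopt"). Because it clicks a single element, it does not run into stale elements after the page changes.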
The pattern for handling each iteration depends on the type of action performed on each item. If the action is a click, you can't collect all the links up front and then click each of them, because the first click loads a new page and makes the element list stale. So if you wish to click on each link, one way is to use an XPath containing the position of the link for each iteration:
# iteration 1
driver.find_element(:xpath, "(//h3[@class='r']/a)[1]").click # click first link
# iteration 2
driver.find_element(:xpath, "(//h3[@class='r']/a)[2]").click # click second link
Here is an example that clicks on each link from a result page:
require 'selenium-webdriver'
driver = Selenium::WebDriver.for :chrome
wait = Selenium::WebDriver::Wait.new(timeout: 10) # timeout is in seconds
driver.navigate.to "https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=puppies"
# define the xpath
search_word = "Puppies"
xpath = ("(//h3[@class='r']/a[contains(.,'%s')]" % search_word) + ")[%s]"
# iterate each result by inserting the position in the XPath
i = 0
loop do
  # wait for the results to be loaded
  wait.until { driver.find_elements(:xpath, "(//h3[@class='r']/a)[1]").any? }
  # get the next link
  link = driver.find_elements(:xpath, xpath % [i += 1]).first
  break unless link
  # click the link
  link.click
  # wait for a new page
  wait.until { driver.find_elements(:xpath, "(//h3[@class='r']/a)[1]").empty? }
  # handle the new page
  puts "Page #{i}: " + driver.title
  # return to the main page
  driver.navigate.back
end
puts "The end!"
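Another way to avoid stale elements, offered here only as a sketch and not as part of the approach above, is to read every href before leaving the results page and then open each URL directly. The method name visit_each_link is hypothetical, and `driver` is assumed to be a Selenium::WebDriver instance:

```ruby
# Hypothetical helper: collect the link targets first, then visit each URL.
# Only plain strings are kept, so nothing goes stale when the page changes.
def visit_each_link(driver, xpath)
  urls = driver.find_elements(:xpath, xpath)
               .map { |anchor| anchor.attribute("href") }
               .compact
  # Open each URL in turn and return the page titles in visiting order.
  urls.map do |url|
    driver.get(url)
    driver.title
  end
end

# Usage with a real browser session (not executed here):
#   titles = visit_each_link(driver, "//h3[@class='r']/a[contains(.,'Puppies')]")
#   titles.each_with_index { |title, i| puts "Page #{i + 1}: #{title}" }
```

This opens the target pages without clicking, so it only fits when the action is "open the link" and not when the click itself triggers page scripts.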