Scraping with Nokogiri and Ruby before and after JavaScript changes the value

Question

I have a program that scrapes values from https://web.apps.markit.com/WMXAXLP?YYY2220_zJkhPN/sWPxwhzYw8K4DcqW07HfIQykbYMaXf8fTzWT6WKnuivTcM0W584u1QRwj

My current code is:

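require 'nokogiri'
require 'open-uri'
require 'date'

# URI.open comes from open-uri; a bare open(url) no longer fetches URLs on Ruby 3.0+
doc = Nokogiri::HTML(URI.open(source_url))

date = doc.css('span.indexDate').text
puts date
date = Date.parse(date)
puts date
values = doc.css('table#CdsIndexTable td.col2 span')
puts values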

This correctly scrapes the date and the second-column values from the "CDS Indexes" table. Now I want to scrape the corresponding values from the "Bond Indexes" table, and that is where I run into a problem.

I can see that a JavaScript function switches between the tables without reloading the page or changing its URL. The only difference between the two tables is their IDs, which is what I would expect. Unfortunately, when I try:

values = doc.css('table#BondIndexTable')
puts values

I get nothing from the Bond Indexes table, but I do get values from the CDS Indexes table if I use:

values = doc.css('table#CdsIndexTable')
puts values

How can I get the values from both tables?

Winston Kotzan · Accepted Answer

You can use Capybara with the Poltergeist driver to execute the JavaScript and render the page before parsing it. Poltergeist is a wrapper for the PhantomJS headless browser. Here's an example of how you can do it:

require 'nokogiri'
require 'capybara'
require 'capybara/dsl'
require 'capybara/poltergeist'

# Drive a headless PhantomJS browser; no local Rack app is needed
Capybara.default_driver = :poltergeist
Capybara.run_server = false

module GetPrice
  class WebScraper
    include Capybara::DSL

    # Loads the page in the headless browser, then parses the rendered HTML
    def get_page_data(url)
      visit(url)
      doc = Nokogiri::HTML(page.html)
      # Second-column spans from every table on the page
      doc.css('td.col2 span')
    end
  end
end

scraper = GetPrice::WebScraper.new
puts scraper.get_page_data('https://web.apps.markit.com/WMXAXLP?YYY2220_zJkhPN/sWPxwhzYw8K4DcqW07HfIQykbYMaXf8fTzWT6WKnuivTcM0W584u1QRwj').map(&:text).inspect

A complete example using Amazon.com is available here: https://github.com/wakproductions/amazon_get_price/blob/master/getprice.rb

Tags:

javascript

ruby

web-scraping

nokogiri

K M Rakibul Islam

1 Answers

Winston Kotzan
