
Can Nokogiri interpret JavaScript? - Web Scraping

We are trying to scrape the availabilities on this page: http://www.equityapartments.com/new-york/new-york-city-apartments/midtown-west/mantena-apartments.aspx

I need my spider to select "All Floorplans" and fetch all the availabilities, but I believe the data is actually loaded through a JavaScript request. Is there a way for my Nokogiri spider to render it? Or can it simulate clicking on buttons?

AlexWang asked Dec 09 '22 03:12



1 Answer

Nokogiri is just a parser. It also lets you search the parsed content, but it does not execute JavaScript.

To interact with web pages you need something that drives a browser, e.g. Watir with PhantomJS.

Combining them all together:

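require 'watir'
require 'nokogiri'

# Drive a headless browser so the page's JavaScript actually runs.
browser = Watir::Browser.new(:phantomjs)

browser.goto(your_url_above)
# The link text must match the page exactly; adjust the case if needed.
browser.link(text: 'All floorplans').click

# Hand the rendered HTML to Nokogiri and query it as usual.
document = Nokogiri::HTML(browser.html)
document.search(...)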
Maxim answered Dec 22 '22 22:12
