Get background image with Nokogiri from DOM?

Question

I'm scraping a site and I can't get the images, because they are loaded with background-image CSS.

Is there a way to get these attributes with Nokogiri without having to use Phantom.js or Sentinel? The background-image is actually set with inline styles, so I should be able to.

I have to get images from an array of URLs:

<div class="zoomLens" style="background-image: url(http://resources1.okadirect.com/assets/en/new/catalogue/1200x1200/EHD005MET-L_01.jpg?version=7); background-position: -14.7368421052632px -977.894736842105px; background-repeat: no-repeat;">&nbsp;</div>

I'm using Nokogiri via Mechanize, but don't know how to write this correctly:

image = agent.get(doc.parser.at('.zoomLens')["background-image"]).save("okaimages/f_deco-#{counter}.jpg")

the Tin Man · Accepted Answer

I'd use something like:

require 'nokogiri'

doc = Nokogiri::HTML('<div class="zoomLens" style="background-image: url(http://resources1.okadirect.com/assets/en/new/catalogue/1200x1200/EHD005MET-L_01.jpg?version=7); background-position: -14.7368421052632px -977.894736842105px; background-repeat: no-repeat;">&nbsp;</div>')

doc.search('.zoomLens').map { |n| n['style'][/url\((.+)\)/, 1] }
# => ["http://resources1.okadirect.com/assets/en/new/catalogue/1200x1200/EHD005MET-L_01.jpg?version=7"]

The trick is finding the appropriate pattern to grab the contents of the parentheses. n['style'][/url\((.+)\)/, 1] uses String#[], which can take a regular expression with grouping and return a particular group from the captures. See https://www.regex101.com/r/mV6rY6/1 for a breakdown of what it's doing.
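The String#[] lookup described above can be made a little more forgiving. The following is only a sketch, not part of the original answer: the constant STYLE_URL and the helper background_url are hypothetical names, and the quoted and multi-background style strings are made-up inputs. It assumes the URL may be wrapped in quotes and that a style may contain more than one url(...), so it uses a non-greedy group and returns the first match.

```ruby
# Hypothetical helper: pulls the first url(...) value out of an inline style.
# The lazy group (.*?) stops at the first closing parenthesis, and the
# optional ['"] on each side drops quotes around the URL if present.
STYLE_URL = /url\(\s*['"]?(.*?)['"]?\s*\)/

def background_url(style)
  # to_s lets this return nil for elements that have no style attribute
  style.to_s[STYLE_URL, 1]
end

background_url('background-image: url("a.png"); background-repeat: no-repeat;')
# => "a.png"
background_url('background: url(a.png), url(b.png);')
# => "a.png"
background_url(nil)
# => nil
```

With Nokogiri this would slot into the same map call, for example doc.search('.zoomLens').map { |n| background_url(n['style']) }.compact. A URL that itself contains a closing parenthesis would be cut short by this pattern.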

At that point you'd be sitting on an array of image URLs. You can easily iterate over the list and use OpenURI or any number of other HTTP clients to retrieve the images.
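As a rough illustration of that last step, here is one way the download loop might look with OpenURI. This is a sketch under assumptions: save_images is a hypothetical helper, the .jpg extension and the index-based file names are guesses modelled on the "okaimages/f_deco-#{counter}.jpg" path in the question, and the fetcher argument exists only so the file handling can be tried without touching the network.

```ruby
require 'open-uri'
require 'fileutils'

# Hypothetical helper: saves each URL as "<dir>/<prefix>-<index>.jpg"
# and returns the list of paths written. By default the body is read
# with OpenURI; pass another callable as fetcher to use a different client.
def save_images(urls, dir:, prefix: 'image', fetcher: ->(url) { URI.open(url, &:read) })
  FileUtils.mkdir_p(dir)
  urls.each_with_index.map do |url, index|
    path = File.join(dir, "#{prefix}-#{index}.jpg")
    File.binwrite(path, fetcher.call(url)) # binary write keeps image bytes intact
    path
  end
end

# Example call (performs real HTTP requests, so it is left commented out):
# save_images(image_urls, dir: 'okaimages', prefix: 'f_deco')
```

Here image_urls stands for the array produced by the map call earlier. Error handling (timeouts, 404s, non-JPEG responses) is left out and would be needed for a real scrape.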

Tags: html, ruby, nokogiri

Gibson
