Node.js scraping, converting image src -> full URL

I'm using Cheerio (https://github.com/MatthewMueller/cheerio) to scrape websites and get images for a project I'm working on. I'm wondering if there's an easy way with Node.js (or another package) to convert the $(img).attr('src') to a fully qualified URL? Sometimes I'll get "image.jpg" and other times "../../image.jpg", and other times "//somepath/image.jpg". Perhaps I'm just missing a regex of some sort... Thanks for your time :)

ewindsor asked Oct 26 '12 01:10



1 Answer

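Look at Node's built-in url module. Specifically, url.resolve(from, to) should be what you're looking for: pass the URL of the page you scraped as from and the img src value as to, and it returns the fully qualified URL. It handles plain relative paths, "../" segments, and protocol-relative "//" paths, so no regex is needed.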
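For illustration, here is a minimal sketch of how that might look for the three kinds of src values from the question. The page URL and the toAbsolute helper are made up for the example; in a real scraper the base would be the URL of the page you fetched.

```javascript
const url = require('url');

// Hypothetical URL of the page the images were scraped from.
const pageUrl = 'http://example.com/a/b/page.html';

// Hypothetical helper: resolve an img src against the page it came from.
function toAbsolute(base, src) {
  return url.resolve(base, src);
}

// The three src forms mentioned in the question.
const sources = ['image.jpg', '../../image.jpg', '//somepath/image.jpg'];
const resolved = sources.map((src) => toAbsolute(pageUrl, src));

console.log(resolved);
// [ 'http://example.com/a/b/image.jpg',
//   'http://example.com/image.jpg',
//   'http://somepath/image.jpg' ]
```

With Cheerio you would pass $(img).attr('src') as the second argument. On newer Node.js versions, where url.resolve is documented as a legacy API, new URL(src, pageUrl).href should give the same results for these inputs.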

Waylon Flinn answered Nov 19 '22 23:11
