
Jsoup: how to get an image's absolute url?

Tags:

jsoup

Is there a way in jsoup to extract an image's absolute URL, much like one can get a link's absolute URL?

Consider the following image element, found in http://www.example.com/:

<img src="images/chicken.jpg" width="60px" height="80px"> 

I would like to get http://www.example.com/images/chicken.jpg. What should I do?

r0u1i asked Feb 02 '11 13:02



2 Answers

Once you have the image element, e.g.:

Element image = document.select("img").first();
String url = image.absUrl("src");
// url = http://www.example.com/images/chicken.jpg

Alternatively:

String url = image.attr("abs:src"); 

jsoup has a built-in absUrl() method on every node. It resolves an attribute to an absolute URL using the node's base URL (which may differ from the URL the document was retrieved from).

See also the Working with URLs jsoup documentation.
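
To illustrate what "resolving against a base URL" means, here is a minimal sketch that uses only the JDK's java.net.URI rather than jsoup itself. The class name AbsoluteUrlSketch and the helper resolve are made up for this example, and the blog/post.html and cdn.example.org URLs are assumptions added for illustration. It is not jsoup's implementation, only the same general resolution rule applied by hand.

```java
import java.net.URI;

public class AbsoluteUrlSketch {

    // Hypothetical helper: resolve a possibly relative reference against a base URL.
    static String resolve(String baseUrl, String reference) {
        return URI.create(baseUrl).resolve(reference).toString();
    }

    public static void main(String[] args) {
        String src = "images/chicken.jpg";

        // Base is the site root, as in the question.
        System.out.println(resolve("http://www.example.com/", src));

        // A different base (assumed for illustration) gives a different result.
        System.out.println(resolve("http://www.example.com/blog/post.html", src));

        // A reference that is already absolute is returned unchanged.
        System.out.println(resolve("http://www.example.com/", "http://cdn.example.org/logo.png"));
    }
}
```

The second call shows why the base URL matters: the same src value resolves to http://www.example.com/blog/images/chicken.jpg when the base is a page under /blog/.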

Jonathan Hedley answered Sep 23 '22 21:09


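// Needs org.jsoup.Jsoup, org.jsoup.nodes.Document, org.jsoup.nodes.Element and org.jsoup.select.Elements
Document doc = Jsoup.connect("http://www.abc.com").get(); // the URL must include the protocol
Elements img = doc.getElementsByTag("img");
for (Element el : img) {
    String src = el.absUrl("src"); // absolute URL, resolved against the element's base URL
    System.out.println("Image Found!");
    System.out.println("src attribute is : " + src);
    getImages(src); // getImages is not defined in this answer; it is the answerer's own helper
}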
Gaurab Pradhan answered Sep 24 '22 21:09