
Find content of href link and URL in Java

Tags: java, href

I want to parse this link:

<a href="http://www.google.fr">Link to google</a>

In order to get two results:

Link = "http://www.google.fr"
LinkName = "Link to google"

I don't know how to do this. Is there a Java library that solves this problem?

Thanks in advance,

Thordax asked Apr 24 '12 15:04


1 Answer

Use the jsoup parser. For example:

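import java.io.File;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// Parse the file; the base URI resolves relative links. May throw IOException.
File input = new File("/tmp/input.html");
Document doc = Jsoup.parse(input, "UTF-8", "http://example.com/");

// Assumes the page has an element with id="content"; getElementById returns null otherwise.
Element content = doc.getElementById("content");
Elements links = content.getElementsByTag("a");
for (Element link : links) {
    String linkHref = link.attr("href"); // value of the href attribute (the URL)
    String linkText = link.text();       // visible text of the link
}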
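
If adding jsoup as a dependency is not an option, the same two values can probably be pulled out with the HTML parser that ships with the JDK (javax.swing.text.html). The sketch below is a different technique from the jsoup code above, not a jsoup API: the class name LinkExtractor and the method firstLink are hypothetical, and it only looks at the first <a> element in the input.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper (not part of jsoup): relies on the JDK's Swing HTML parser.
class LinkExtractor {

    // Returns {href, link text} for the first <a> element, or null if there is none.
    static String[] firstLink(String html) throws IOException {
        final String[] href = new String[1];
        final StringBuilder text = new StringBuilder();
        final boolean[] found = {false};

        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            private boolean inside = false; // currently inside the first <a>
            private boolean done = false;   // the first <a> has already been closed

            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.A && !inside && !done) {
                    inside = true;
                    found[0] = true;
                    Object value = attrs.getAttribute(HTML.Attribute.HREF);
                    href[0] = (value == null) ? "" : value.toString();
                }
            }

            @Override
            public void handleText(char[] data, int pos) {
                // Text may arrive in several chunks, so append rather than assign.
                if (inside) {
                    text.append(data);
                }
            }

            @Override
            public void handleEndTag(HTML.Tag tag, int pos) {
                if (tag == HTML.Tag.A && inside) {
                    inside = false;
                    done = true;
                }
            }
        };

        // The last argument tells the parser to ignore any charset declared in the markup.
        new ParserDelegator().parse(new StringReader(html), callback, true);
        return found[0] ? new String[] { href[0], text.toString() } : null;
    }
}
```

Calling LinkExtractor.firstLink with the anchor from the question should give an array whose first element is the Link value and whose second is the LinkName value. For anything beyond a single simple anchor, jsoup is likely the more robust choice.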
Nurlan answered Oct 17 '22 10:10