Jsoup get href within a class

I have this HTML that I need to parse: <a class="sushi-restaurant" href="/greatSushi">Best Sushi in town</a>

I know there's a jsoup example that gets all the links in a page, e.g.

Elements links = doc.select("a[href]");
for (Element link : links) {
    print(" * a: <%s>  (%s)", link.attr("abs:href"),
        trim(link.text(), 35));
}

but I need code that returns the href only for the link with that specific class.

Thanks guys

Reza asked Jul 26 '11 11:07


1 Answer

You can select elements by class. This example finds elements with the class sushi-restaurant, then gets the absolute URL of the first result.

Make sure that when you parse the HTML, you specify the base URL (where the document was fetched from) to allow jsoup to determine what the absolute URL of a link is.

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class SushiLink {
    public static void main(String[] args) {
        String html = "<a class=\"sushi-restaurant\" href=\"/greatSushi\">Best Sushi in town</a>";
        // the second argument is the base URL used to resolve relative links
        Document doc = Jsoup.parse(html, "http://example.com/");
        // find all <a class="sushi-restaurant">...
        Elements links = doc.select("a.sushi-restaurant");
        Element link = links.first();
        // 'abs:' resolves "/greatSushi" to "http://example.com/greatSushi":
        String url = link.attr("abs:href");
        System.out.println("url = " + url);
    }
}
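
To see what the base URL changes, here is a minimal sketch that reads the same link three ways: the raw href, the absolute href with a base URL, and the absolute href with an empty base URL. The class name BaseUrlDemo and its helper methods are made up for this illustration; only the jsoup calls come from the library.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Hypothetical demo class; not part of jsoup.
class BaseUrlDemo {
    static final String HTML =
        "<a class=\"sushi-restaurant\" href=\"/greatSushi\">Best Sushi in town</a>";

    // Absolute href of the first matching link, resolved against baseUrl.
    static String absoluteHref(String baseUrl) {
        Document doc = Jsoup.parse(HTML, baseUrl);
        Element link = doc.select("a.sushi-restaurant").first();
        return link.attr("abs:href");
    }

    // The href exactly as written in the markup, with no resolution.
    static String rawHref() {
        Document doc = Jsoup.parse(HTML);
        return doc.select("a.sushi-restaurant").first().attr("href");
    }

    public static void main(String[] args) {
        System.out.println(rawHref());                            // /greatSushi
        System.out.println(absoluteHref("http://example.com/"));  // http://example.com/greatSushi
        // With no usable base URL the relative link cannot be resolved,
        // so "abs:href" comes back as an empty string.
        System.out.println(absoluteHref("").isEmpty());           // true
    }
}
```

An empty result from "abs:href" is usually a sign that the document was parsed without a base URL.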

Shorter version:

String url = doc.select("a.sushi-restaurant").first().attr("abs:href");
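
One caveat with the one-liner: first() returns null when nothing matches the selector, so chaining .attr(...) onto it would throw a NullPointerException on a page without that class. Below is a rough sketch of a null-safe variant, plus a loop that collects the href of every matching link. The class ClassLinks, its method names, and the extra links in the sample HTML are assumptions for illustration, and building the selector by string concatenation assumes a simple class name with no special CSS characters.

```java
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Hypothetical helper class; not part of jsoup.
class ClassLinks {
    // Sample document: the original link plus two made-up ones.
    static Document sampleDoc() {
        String html =
            "<a class=\"sushi-restaurant\" href=\"/greatSushi\">Best Sushi in town</a>"
          + "<a class=\"sushi-restaurant\" href=\"/okSushi\">Decent Sushi</a>"
          + "<a class=\"pizza-place\" href=\"/pizza\">Pizza</a>";
        return Jsoup.parse(html, "http://example.com/");
    }

    // Absolute href of the first <a> with the given class, or null if none matches.
    static String firstHref(Document doc, String cssClass) {
        Element link = doc.select("a." + cssClass).first();
        return link == null ? null : link.attr("abs:href");
    }

    // Absolute hrefs of every <a> with the given class, in document order.
    static List<String> allHrefs(Document doc, String cssClass) {
        List<String> urls = new ArrayList<>();
        for (Element link : doc.select("a." + cssClass)) {
            urls.add(link.attr("abs:href"));
        }
        return urls;
    }

    public static void main(String[] args) {
        Document doc = sampleDoc();
        System.out.println(firstHref(doc, "sushi-restaurant")); // http://example.com/greatSushi
        System.out.println(allHrefs(doc, "sushi-restaurant"));  // both sushi links
        System.out.println(firstHref(doc, "burger-bar"));       // null
    }
}
```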

Hope this helps!

Jonathan Hedley answered Nov 19 '22 11:11