I have this HTML that I need to parse:
<a class="sushi-restaurant" href="/greatSushi">Best Sushi in town</a>
I know there's a jsoup example that gets all the links in a page, e.g.
Elements links = doc.select("a[href]");
for (Element link : links) {
    print(" * a: <%s> (%s)", link.attr("abs:href"),
            trim(link.text(), 35));
}
but I need a piece of code that returns the href of the link with that specific class.
Thanks guys
You can select elements by class. This example finds elements with the class sushi-restaurant, then gets the absolute URL of the first result.
Make sure that when you parse the HTML, you specify the base URL (where the document was fetched from) so that jsoup can resolve relative links to absolute URLs.
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class SushiLink {
    public static void main(String[] args) {
        String html = "<a class=\"sushi-restaurant\" href=\"/greatSushi\">Best Sushi in town</a>";
        Document doc = Jsoup.parse(html, "http://example.com/");
        // find all <a class="sushi-restaurant">...
        Elements links = doc.select("a.sushi-restaurant");
        // first() returns null if nothing matched
        Element link = links.first();
        // 'abs:' makes "/greatSushi" = "http://example.com/greatSushi":
        String url = link.attr("abs:href");
        System.out.println("url = " + url);
    }
}
Shorter version:
String url = doc.select("a.sushi-restaurant").first().attr("abs:href");
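One caveat with the one-liner: first() returns null when no element matches, so it throws a NullPointerException on a page that has no link with that class. Below is a minimal sketch of a null-safe variant that can also collect every match. The class name ClassLinks and the helpers hrefsForClass and firstHrefForClass are made up for illustration, and it assumes a reasonably recent jsoup (one that has Elements.eachAttr) and a plain class name with no special CSS characters.

```java
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ClassLinks {

    // Hypothetical helper: absolute URLs of every <a> with the given class and an href.
    static List<String> hrefsForClass(String html, String baseUrl, String cssClass) {
        Document doc = Jsoup.parse(html, baseUrl);
        // eachAttr gathers the attribute value from each matched element that has it
        return doc.select("a." + cssClass + "[href]").eachAttr("abs:href");
    }

    // Hypothetical helper: absolute URL of the first match, or null if there is none.
    static String firstHrefForClass(String html, String baseUrl, String cssClass) {
        Document doc = Jsoup.parse(html, baseUrl);
        Element link = doc.select("a." + cssClass + "[href]").first();
        // absUrl("href") is equivalent to attr("abs:href")
        return link == null ? null : link.absUrl("href");
    }

    public static void main(String[] args) {
        String html = "<a class=\"sushi-restaurant\" href=\"/greatSushi\">Best Sushi in town</a>";
        System.out.println(hrefsForClass(html, "http://example.com/", "sushi-restaurant"));
        System.out.println(firstHrefForClass(html, "http://example.com/", "no-such-class"));
    }
}
```

Returning null keeps the sketch short; an Optional<String> would work just as well if you prefer to make the "no match" case explicit.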
Hope this helps!