I have this table:
<table class="table" >
<tr>
<td><a href="url">text1</a></td>
<td>text2</td>
</tr>
<tr>
<td><a href="url2">text</a></td>
<td>text</td>
</tr>
</table>
and I want to extract the URL and link text from every row. I use:
Document doc = Jsoup.connect(url).get();
for (Element table : doc.select("table.table")) {
for (Element row : table.select("tr")) {
Elements tds = row.select("td");
String text1=tds.get(0).text();
String url= row.attr("href");
System.out.println(text1+ "," + url);
}
}
I get the text1 value but url is null.
How can I get the url from the td tags?
Your row variable is the tr element, not the a tag, so there is no href attribute on it.
Try with this:
// select() returns Elements (a list), so take the first matching table
Element table = doc.select("table.table").first();
Elements links = table.getElementsByTag("a");
for (Element link: links) {
String url = link.attr("href");
String text = link.text();
System.out.println(text + ", " + url);
}
This is pretty much taken from the jsoup documentation.
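If you want to keep the row-by-row loop from the question, here is a minimal sketch of how that could look. It assumes jsoup is on the classpath; the class name RowLinks and the helper extractLinks are made up for illustration, and the HTML is parsed from a string rather than fetched so the example runs without a network call.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class RowLinks {

    // Hypothetical helper: returns one "text,url" entry per table row that contains a link.
    static List<String> extractLinks(String html) {
        Document doc = Jsoup.parse(html);
        List<String> result = new ArrayList<>();
        for (Element row : doc.select("table.table tr")) {
            // The href lives on the <a> inside a cell, not on the row itself.
            Element link = row.select("td a[href]").first();
            if (link == null) {
                continue; // skip rows without a link, e.g. header rows
            }
            result.add(link.text() + "," + link.attr("href"));
        }
        return result;
    }

    public static void main(String[] args) {
        String html = "<table class=\"table\">"
                + "<tr><td><a href=\"url\">text1</a></td><td>text2</td></tr>"
                + "<tr><td><a href=\"url2\">text</a></td><td>text</td></tr>"
                + "</table>";
        // Prints:
        // text1,url
        // text,url2
        for (String line : extractLinks(html)) {
            System.out.println(line);
        }
    }
}
```

When the document comes from Jsoup.connect(url).get() instead of a string, link.absUrl("href") should resolve relative links against the page URL, which is usually what you want for scraping.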