How could I use Jsoup to extract specification data from this website separately for each row e.g. Network->Network Type, Battery etc.
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
public class mobilereviews {
public static void main(String[] args) throws Exception {
Document doc = Jsoup.connect("http://mobilereviews.net/details-for-Motorola%20L7.htm").get();
for (Element table : doc.select("table")) {
for (Element row : table.select("tr")) {
Elements tds = row.select("td");
System.out.println(tds.get(0).text());
}
}
}
}
Here is an attempt to find the solution to your problem
Document doc = Jsoup.connect("http://mobilereviews.net/details-for-Motorola%20L7.htm").get();
for (Element table : doc.select("table[id=phone_details]")) {
for (Element row : table.select("tr:gt(2)")) {
Elements tds = row.select("td:not([rowspan])");
System.out.println(tds.get(0).text() + "->" + tds.get(1).text());
}
}
Parsing the HTML is tricky and if the HTML changes your code needs to change as well.
You need to study the HTML markup to come up with your parsing rules first.
table[id=phone_details]tr:gt(2)td:not([rowspan])For more complex options in the selector syntax, look here http://jsoup.org/cookbook/extracting-data/selector-syntax
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With