
What is the best way to scrape this HTML for an android app?

What is the best way to scrape the HTML below from a web page? I want to pull out Apple, Orange and Grape and put them into a dropdown menu in my Android app. Should I use jsoup for this, and if so, how? Or should I use regex instead?

<select name="fruit" id="fruit" >
<option value="APPLE">Apple</option>
<option value="ORANGE">Orange</option>
<option value="GRAPE">Grape</option>
</select>
alexD asked Sep 19 '11 19:09


2 Answers

Depends, but I'd go with an XML/HTML parser. Don't use regex.

Example with jsoup:

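Document doc = Jsoup.connect(someUrl).get();
// Select every <option> inside <select id="fruit">
Elements options = doc.select("select#fruit option");
List<String> fruits = options.eachText(); // visible text of each option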

More on jsoup selector syntax.


Best way?

I would go with either the built-in DOM parser or the built-in SAX parser. For a large document, SAX is faster and uses less memory, because it streams parse events instead of building the whole tree. For a small document there's not much difference. More on SAX vs DOM.
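
As a rough illustration of the built-in DOM route, here is a minimal sketch that parses the fragment from the question with the JDK's javax.xml.parsers API and collects the option labels. The class name FruitOptions and the method optionLabels are made up for this example. It also assumes the markup is well-formed XML, which holds for the snippet in the question but often does not hold for real web pages; in that case a lenient HTML parser such as jsoup is the safer choice.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, named only for this example.
class FruitOptions {

    // Returns the visible text of every <option> element.
    // Assumption: the markup is well-formed XML; otherwise parse() throws.
    static List<String> optionLabels(String markup) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(markup)));

        NodeList options = doc.getElementsByTagName("option");
        List<String> labels = new ArrayList<>();
        for (int i = 0; i < options.getLength(); i++) {
            labels.add(options.item(i).getTextContent().trim());
        }
        return labels;
    }

    public static void main(String[] args) throws Exception {
        // The fragment from the question, inlined as a string.
        String html = "<select name=\"fruit\" id=\"fruit\" >"
                + "<option value=\"APPLE\">Apple</option>"
                + "<option value=\"ORANGE\">Orange</option>"
                + "<option value=\"GRAPE\">Grape</option>"
                + "</select>";
        System.out.println(optionLabels(html)); // prints [Apple, Orange, Grape]
    }
}
```

On Android, a list like this could then back a Spinner through an ArrayAdapter<String>. Fetching the page must happen off the main thread.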

skyuzo answered Sep 18 '22 10:09


For HTML parsing you can use jsoup. It is easy to use and the API is great.

http://jsoup.org/

For me it worked great!

EDIT: too slow :D skyuzo's post is great :)

dudeldidadum answered Sep 20 '22 10:09