
web scraping java beginner [closed]

I am new to Java, and I would like to become really good at web scraping and parsing data.

Are there any sites about web scraping that would help me understand how APIs like HtmlCleaner, Web-Harvest, and htmlparser work?

I'm not yet proficient enough in Java to read their Javadocs and understand how all their methods work, and I cannot find Java code examples or tutorials on the web that would help me.

user807593 asked Jun 22 '11 20:06



2 Answers

Why not try the JSoup library?

The cookbook introduction is a good place to start, or you can go straight to the more specific code examples if you prefer.
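To give a feel for what the JSoup API looks like, here is a minimal sketch, assuming the jsoup jar is on the classpath. The class name `JsoupSketch`, the helper `extractLinks`, the sample HTML, and the `example.com` base URL are all made up for illustration; they are not taken from the cookbook. It parses an HTML string and lists every link with its absolute URL.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class JsoupSketch {

    // Hypothetical helper: parses an HTML string and returns one
    // "link text -> absolute URL" entry per <a href="..."> element.
    static List<String> extractLinks(String html, String baseUri) {
        // baseUri lets jsoup resolve relative links such as "/docs".
        Document doc = Jsoup.parse(html, baseUri);
        List<String> links = new ArrayList<>();
        // select() takes a CSS-style selector; "a[href]" matches anchors that have an href.
        for (Element a : doc.select("a[href]")) {
            links.add(a.text() + " -> " + a.absUrl("href"));
        }
        return links;
    }

    public static void main(String[] args) {
        // Made-up sample page, parsed from a string so no network access is needed.
        String html = "<html><head><title>Sample page</title></head>"
                + "<body><p>See <a href=\"/docs\">the docs</a> and "
                + "<a href=\"http://example.org/faq\">the FAQ</a>.</p></body></html>";
        for (String link : extractLinks(html, "http://example.com/")) {
            System.out.println(link);
        }
    }
}
```

For a live page, `Jsoup.connect(url).get()` returns the same kind of `Document`, so the selector code stays unchanged.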

Marsellus Wallace answered Oct 01 '22 09:10


Have you tried using the examples at:

  • http://htmlcleaner.sourceforge.net/javause.php
  • http://web-harvest.sourceforge.net/usage.php
  • http://chasethedevil.blogspot.com/2006/05/java-html-parsing-example-with.html

Maybe those can be of some help?

aemus answered Oct 01 '22 09:10