
Extract links from a web page

Using Java, how can I extract all the links from a given web page?

Asked Feb 25 '11 by Wassim AZIRAR


1 Answer

Download the page as plain text/HTML and pass it through Jsoup or HtmlCleaner. Both are similar and can parse even malformed HTML 4.0 syntax. You can then use the familiar HTML DOM traversal methods such as getElementsByTag("a"), or, in Jsoup, it's even cooler: you can simply use

File input = new File("/tmp/input.html");
Document doc = Jsoup.parse(input, "UTF-8", "http://example.com/");

Elements links = doc.select("a[href]");        // a with href
Elements pngs = doc.select("img[src$=.png]");  // img with src ending .png

Element masthead = doc.select("div.masthead").first();

and find all links, then get the details using

String linkhref = links.attr("href");

Taken from http://jsoup.org/cookbook/extracting-data/selector-syntax

The selectors use the same syntax as jQuery; if you know jQuery function chaining, you will certainly love this.

EDIT: In case you want more tutorials, you can try out this one made by mkyong.

http://www.mkyong.com/java/jsoup-html-parser-hello-world-examples/
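If adding a dependency is not an option, here is a rough, stdlib-only sketch of the same idea using a regex. The class and method names are hypothetical, and a regex is not a real HTML parser: it will miss or mangle links in malformed markup, which is exactly why Jsoup is recommended above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical fallback: pulls href values out of anchor tags with a regex.
// Works for simple, well-formed markup only -- prefer Jsoup for anything real.
public class LinkExtractor {
    private static final Pattern HREF = Pattern.compile(
            "<a\\s+[^>]*href\\s*=\\s*[\"']([^\"']+)[\"']",
            Pattern.CASE_INSENSITIVE);

    public static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(1)); // captured href value
        }
        return links;
    }

    public static void main(String[] args) {
        String html = "<p><a href=\"http://example.com/\">home</a>"
                    + "<a class='nav' href='/about'>about</a></p>";
        System.out.println(extractLinks(html)); // [http://example.com/, /about]
    }
}
```

To fetch the page itself you can read from a java.net.URL stream and feed the resulting string to extractLinks.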

Answered Oct 12 '22 by samarjit samanta