
Getting 404 while accessing a webpage through jsoup

Tags: jsoup

I am getting a 404 when accessing a web page through jsoup, but the page loads fine when accessed through a browser.

I was able to access the page through jsoup a few days ago, but now it throws a 404. I tried setting a User-Agent, a timeout, etc., but with no luck.

In Firebug I also see a 404 status for the request, yet the page still renders in the browser.

I don't understand how the page gets rendered in the browser but not through my Java program.

Document doc = Jsoup
                 .connect("http://example.com/stock.php?" + quote)
                 .userAgent("Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.110 Safari/537.36")
                 .timeout(1000 * 7)
                 .get();

Executing the Java program produces the error below:

org.jsoup.HttpStatusException: HTTP error fetching URL. Status=404, URL=http://example.com/stock.php?AAA
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:537)

Please let me know if more information is required.

asked Apr 09 '16 04:04 by realtime




1 Answer

By default, jsoup throws an exception when it receives an HTTP error status. Call ignoreHttpErrors(true) on the connection to read the page contents even when the server returned an error status.

Document doc = Jsoup
                 .connect("http://example.com/stock.php?"+quote)
                 .userAgent("...")
                 .timeout(1000*7)
                 .ignoreHttpErrors(true) 
                 .get();
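
A server can send a 404 status together with a complete HTML body. A browser renders that body anyway, which would explain why Firebug shows 404 while the page looks fine. The sketch below illustrates this using only the JDK's built-in HTTP server and HttpURLConnection, not jsoup. The class name, the page content, and the local URL are hypothetical stand-ins for the real site.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;

public class NotFoundWithBodyDemo {

    // Hypothetical page content standing in for the real stock page.
    static final String PAGE = "<html><body><h1>Stock quote</h1></body></html>";

    /** Starts a local server that answers every request with 404 plus a full HTML body. */
    static HttpServer startServer() throws Exception {
        // Port 0 lets the OS pick any free port.
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            byte[] body = PAGE.getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().add("Content-Type", "text/html; charset=utf-8");
            exchange.sendResponseHeaders(404, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    /** Returns {status code, body}; for 4xx/5xx the body comes from the error stream. */
    static String[] fetch(String url) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        int status = conn.getResponseCode();
        InputStream in = status >= 400 ? conn.getErrorStream() : conn.getInputStream();
        try (InputStream body = in) {
            String text = new String(body.readAllBytes(), StandardCharsets.UTF_8);
            return new String[] { String.valueOf(status), text };
        }
    }

    /** Starts the server, fetches one page, and always stops the server afterwards. */
    static String[] run() throws Exception {
        HttpServer server = startServer();
        try {
            int port = server.getAddress().getPort();
            return fetch("http://127.0.0.1:" + port + "/stock.php?AAA");
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws Exception {
        String[] result = run();
        System.out.println("Status: " + result[0]);
        System.out.println("Body:   " + result[1]);
    }
}
```

With jsoup itself, a similar check is possible by calling execute() instead of get() (keeping ignoreHttpErrors(true)), inspecting statusCode() on the returned Connection.Response, and then calling parse() to obtain the Document.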

answered Sep 17 '22 23:09 by nyname00