My code goes like this:
URL url; URLConnection uc; StringBuilder parsedContentFromUrl = new StringBuilder(); String urlString="http://www.example.com/content/w2e4dhy3kxya1v0d/"; System.out.println("Getting content for URl : " + urlString); url = new URL(urlString); uc = url.openConnection(); uc.connect(); uc.getInputStream(); BufferedInputStream in = new BufferedInputStream(uc.getInputStream()); int ch; while ((ch = in.read()) != -1) { parsedContentFromUrl.append((char) ch); } System.out.println(parsedContentFromUrl);
However when I am trying to access the URL through browser there is no problem , but when I try to access it through a java program, it throws expection:
java.io.IOException: Server returned HTTP response code: 403 for URL
What is the solution?
Add the code below in between uc.connect();
and uc.getInputStream();
:
uc = url.openConnection(); uc.addRequestProperty("User-Agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)");
However, it a nice idea to just allow certain types of user agents. This will keep your website safe and bandwidth usage low.
Some possible bad 'User Agents' you might want to block from your server depending if you don't want people leeching your content and bandwidth. But, user agent can be spoofed as you can see in my example above.
403 means forbidden. From here:-
10.4.4 403 Forbidden
The server understood the request, but is refusing to fulfill it. Authorization will not help and the request SHOULD NOT be repeated. If the request method was not HEAD and the server wishes to make public why the request has not been fulfilled, it SHOULD describe the reason for the refusal in the entity. If the server does not wish to make this information available to the client, the status code 404 (Not Found) can be used instead.
You need to contact the owner of the site to make sure the permissions are set properly.
EDIT I see your problem. I ran the URL through Fiddler. I noticed that I am getting a 407 which means below. This should help you go in the right direction.
10.4.8 407 Proxy Authentication Required
This code is similar to 401 (Unauthorized), but indicates that the client must first authenticate itself with the proxy. The proxy MUST return a Proxy-Authenticate header field (section 14.33) containing a challenge applicable to the proxy for the requested resource. The client MAY repeat the request with a suitable Proxy-Authorization header field (section 14.34). HTTP access authentication is explained in "HTTP Authentication: Basic and Digest Access Authentication"
Also see this relevant question.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With