
How to check if a URL exists or returns 404 with Java?

String urlString = "http://www.nbc.com/Heroes/novels/downloads/Heroes_novel_001.pdf";
URL url = new URL(urlString);
if (/* Url does not return 404 */) {
    System.out.println("exists");
} else {
    System.out.println("does not exists");
}

urlString = "http://www.nbc.com/Heroes/novels/downloads/Heroes_novel_190.pdf";
url = new URL(urlString);
if (/* Url does not return 404 */) {
    System.out.println("exists");
} else {
    System.out.println("does not exists");
}

This should print

exists
does not exists

TEST

public static String URL = "http://www.nbc.com/Heroes/novels/downloads/";

public static int getResponseCode(String urlString) throws MalformedURLException, IOException {
    URL u = new URL(urlString);
    HttpURLConnection huc = (HttpURLConnection) u.openConnection();
    huc.setRequestMethod("GET");
    huc.connect();
    return huc.getResponseCode();
}

System.out.println(getResponseCode(URL + "Heroes_novel_001.pdf"));
System.out.println(getResponseCode(URL + "Heroes_novel_190.pdf"));
System.out.println(getResponseCode("http://www.example.com"));
System.out.println(getResponseCode("http://www.example.com/junk"));

Output

200
200
200
404

SOLUTION

Add the following line before huc.connect(); the output then becomes 200, 404, 200, 404:

huc.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 (.NET CLR 3.5.30729)"); 
asked Sep 04 '09 09:09 by Sergio del Amo



1 Answer

You may want to add

HttpURLConnection.setFollowRedirects(false);
// note : or
//        huc.setInstanceFollowRedirects(false)

if you don't want to follow redirects (3xx responses).
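To see what the server actually answers when redirects are not followed, something like the sketch below may help. The static setFollowRedirects changes the default for every connection, while setInstanceFollowRedirects only affects the one connection it is called on, so the sketch uses the latter. The class name RedirectCheck and the helpers isRedirect and describe are made up for this illustration, and the code assumes an http or https URL (otherwise the cast fails).

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper class, not part of the original post.
final class RedirectCheck {

    // True for any 3xx status code.
    static boolean isRedirect(int responseCode) {
        return responseCode >= 300 && responseCode < 400;
    }

    // Returns the status code as text, plus the Location header when the
    // server answers with a redirect. Assumes an http/https URL.
    static String describe(String urlString) throws IOException {
        HttpURLConnection huc = (HttpURLConnection) new URL(urlString).openConnection();
        huc.setInstanceFollowRedirects(false); // affects this connection only
        try {
            int code = huc.getResponseCode();
            return isRedirect(code)
                    ? code + " -> " + huc.getHeaderField("Location")
                    : String.valueOf(code);
        } finally {
            huc.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Needs network access; the result depends on the server.
        System.out.println(describe("http://www.example.com"));
    }
}
```

With redirects disabled, a 3xx code shows up as-is instead of being silently replaced by the status of the redirect target.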

Instead of doing a "GET", a "HEAD" is all you need: the server sends back the same status code and headers, but no response body.

huc.setRequestMethod("HEAD");
return (huc.getResponseCode() == HttpURLConnection.HTTP_OK);
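Putting the pieces from this page together, a complete check could look roughly like the sketch below. It combines the HEAD request and the redirect setting from this answer with the User-Agent header from the question's SOLUTION section. The class name UrlExists, the helper isOk and the 5-second timeouts are assumptions added for illustration, and the NBC URLs are the ones from the question, which may no longer resolve.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical wrapper class, not part of the original post.
final class UrlExists {

    // Browser-like User-Agent copied from the question's SOLUTION section.
    static final String USER_AGENT =
            "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.2) "
            + "Gecko/20090729 Firefox/3.5.2 (.NET CLR 3.5.30729)";

    // Assumed timeout in milliseconds; the original post sets none.
    static final int TIMEOUT_MS = 5000;

    // Only a plain 200 counts as "exists", matching the answer's comparison.
    static boolean isOk(int responseCode) {
        return responseCode == HttpURLConnection.HTTP_OK;
    }

    // Sends a HEAD request without following redirects. Assumes an http/https URL.
    static boolean exists(String urlString) throws IOException {
        HttpURLConnection huc = (HttpURLConnection) new URL(urlString).openConnection();
        huc.setInstanceFollowRedirects(false);
        huc.setRequestMethod("HEAD");
        huc.setRequestProperty("User-Agent", USER_AGENT);
        huc.setConnectTimeout(TIMEOUT_MS);
        huc.setReadTimeout(TIMEOUT_MS);
        try {
            return isOk(huc.getResponseCode());
        } finally {
            huc.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Needs network access; the result depends on the server.
        String base = "http://www.nbc.com/Heroes/novels/downloads/";
        for (String file : new String[] {"Heroes_novel_001.pdf", "Heroes_novel_190.pdf"}) {
            System.out.println(exists(base + file) ? "exists" : "does not exists");
        }
    }
}
```

Treating only 200 as "exists" is a deliberate simplification here: with redirects disabled, a URL that answers 301 or 302 is reported as missing, which may or may not be what you want.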
answered Sep 21 '22 10:09 by RealHowTo