Java: How to read content from redirected URLs?

Tags:

java

I use the following Java code in a Bean to read a URL's content:

String url;
String inputLine;
StringBuilder srcCode=new StringBuilder();

public void setUrl (String value) {
    url = value; 
}

private void scanWebPage() throws IOException {
    try {
         URL dest = new URL(url);
         URLConnection yc =  dest.openConnection();
         yc.setUseCaches(false);
         BufferedReader in = new BufferedReader(new 
                        InputStreamReader(yc.getInputStream()));
         while ((inputLine = in.readLine()) != null)
            srcCode = srcCode.append (inputLine);
         in.close();
    } catch (FileNotFoundException fne) {
         srcCode.append("File Not Found") ;
    }
}

The code works fine for most URLs but fails for redirected ones, where I get "File Not Found". How can I update the code above to read content from redirected URLs?

user1492667 asked Feb 27 '13 11:02


1 Answer

Give the following a go:

    HttpURLConnection yc =  (HttpURLConnection) dest.openConnection();
    yc.setInstanceFollowRedirects( true );

In the context of your code above:

    String url = "http://java.sun.com";
    String inputLine;
    StringBuilder srcCode = new StringBuilder();

    URL dest = new URL(url);
    HttpURLConnection yc = (HttpURLConnection) dest.openConnection();
    yc.setInstanceFollowRedirects( true );
    yc.setUseCaches(false);

    BufferedReader in = new BufferedReader(
        new InputStreamReader(
            yc.getInputStream()));
    while ((inputLine = in.readLine()) != null) {
        srcCode.append(inputLine);
    }

    in.close();

The version below is modified further to help you diagnose what is going on. It turns off automatic redirection and follows the Location headers manually, printing each URL as it goes.

@Test
public void f() throws IOException {
    String url = "http://java.sun.com";


    fetchURL(url);
}


private HttpURLConnection fetchURL( String url ) throws IOException {
    URL dest = new URL(url);
    HttpURLConnection yc =  (HttpURLConnection) dest.openConnection();
    yc.setInstanceFollowRedirects( false );
    yc.setUseCaches(false);

    System.out.println( "url = " + url );

    int responseCode = yc.getResponseCode();
    if ( responseCode >= 300 && responseCode < 400 ) { // brute force check, far too wide
        // Follow the redirect target; assumes the Location header is present and absolute.
        return fetchURL( yc.getHeaderField( "Location") );
    }

    System.out.println( "yc.getResponseCode() = " + responseCode );

    return yc;
}
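
If the diagnostic output shows the redirect switching protocol (for example from http to https), note that HttpURLConnection does not follow such redirects automatically, even with setInstanceFollowRedirects(true). In that case you could follow the chain yourself and read the body at the end. The following is only a sketch under some assumptions: the class name RedirectReader, the helpers isRedirect and resolve, the hop limit of 5, and the UTF-8 charset are all illustrative choices of mine, not anything from the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class RedirectReader {

    // Arbitrary cap (an assumption) so a redirect loop cannot run forever.
    private static final int MAX_HOPS = 5;

    // Only the status codes that carry a Location to follow,
    // narrower than the 300-399 range check used above.
    static boolean isRedirect(int code) {
        return code == 301 || code == 302 || code == 303
            || code == 307 || code == 308;
    }

    // Location may be relative, so resolve it against the URL just requested.
    static String resolve(String base, String location) {
        return URI.create(base).resolve(location).toString();
    }

    // Follows redirects manually, then returns the body of the final response.
    static String read(String start) throws IOException {
        String current = start;
        for (int hop = 0; hop <= MAX_HOPS; hop++) {
            URL url = URI.create(current).toURL();
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setInstanceFollowRedirects(false);
            conn.setUseCaches(false);

            int code = conn.getResponseCode();
            if (isRedirect(code)) {
                String location = conn.getHeaderField("Location");
                conn.disconnect();
                if (location == null) {
                    throw new IOException("Redirect without Location from " + current);
                }
                current = resolve(current, location);
                continue;
            }

            // Not a redirect: read the body (assumes UTF-8 text).
            StringBuilder body = new StringBuilder();
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = in.readLine()) != null) {
                    body.append(line).append('\n');
                }
            }
            return body.toString();
        }
        throw new IOException("Too many redirects starting at " + start);
    }
}
```

In your bean, scanWebPage() could then do something like srcCode.append(RedirectReader.read(url)). Error responses such as 404 still surface as an IOException from getInputStream(), so the existing catch block remains relevant.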

Chris K answered Oct 13 '22 14:10