
JSoup randomly throws java.io.IOException: stream is closed when running from browser

Tags: javafx, jsoup

I'm having some weird JSoup problem when running my JavaFX application from the browser (or as web-start).

When I run from inside the IDE (Eclipse or Netbeans) or as a standalone app, it runs normally. When I try to run as a web-start or from the browser (Chrome), JSoup randomly throws a "java.io.IOException: stream is closed".

The site I'm trying to parse is thepiratebay.sx. The first time the application runs (from the browser), I get this error. If I try to parse again while the application is still running, then it works... sometimes.

The JSoup code:

try {
    //TODO: Change to HttpFetcher. This method is reporting "stream is closed" when running on browser
    Connection con = Jsoup.connect(url)
            .timeout(HTTP_TIMEOUT)
            .userAgent(UserAgentGenerator.getUserAgent())
            .followRedirects(false);
    doc = con.get();
    System.out.println("Fetching... " + url);
} catch (IOException e) {
    e.printStackTrace();
    System.out.println("Parser connect must have timed out, no results. " + url);
    fetchFailed[i] = true;
    continue;
} finally {
    i++;
    if (CommonTFUtils.isAllTrue(fetchFailed)) {
        throw new HttpException("Fetcher failed on every URL of " + response.getSite_name());
    }
}

And the exception thrown:

CacheEntry[http://thepiratebay.sx/browse/207/0/7]: updateAvailable=true,lastModified=Tue May 14 14:28:16 BRT 2013,length=-1
java.io.IOException: stream is closed
    at sun.net.www.http.ChunkedInputStream.ensureOpen(Unknown Source)
    at sun.net.www.http.ChunkedInputStream.read(Unknown Source)
    at java.io.FilterInputStream.read(Unknown Source)
    at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(Unknown Source)
    at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(Unknown Source)
    at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(Unknown Source)
    at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.close(Unknown Source)
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:468)
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:410)
    at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:164)
    at org.jsoup.helper.HttpConnection.get(HttpConnection.java:153)
    at com.package.torrent.parser.GenericParser.search(GenericParser.java:147)
    at com.package.torrent.parser.GenericParser.browse(GenericParser.java:82)
    at com.package.search.TrackerSearch.searchTracker(TrackerSearch.java:69)
    at com.package.search.TrackerSearch.searchAllTrackers(TrackerSearch.java:40)
    at com.package.search.TrackerSearch.searchAllTrackers(TrackerSearch.java:23)
    at com.package.search.MovieBrowser.browseTrackers(MovieBrowser.java:49)
    at com.package.ui.browse.BrowseController$MovieBrowserTask.call(BrowseController.java:237)
    at com.package.ui.browse.BrowseController$MovieBrowserTask.call(BrowseController.java:213)
    at javafx.concurrent.Task$TaskCallable.call(Task.java:1259)
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

Does anyone have an idea of what might be causing this?

Thanks in advance.

Philippe Franklin asked Nov 03 '22 22:11


1 Answer

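I think I found a solution. Run the line below once, before you ever call JSoup. Apparently, applets and Web Start leave the default useCaches value set to true; this line switches it off for every URLConnection created afterwards. Oddly, setDefaultUseCaches is an instance method even though it changes a static, JVM-wide default, which is why a throwaway connection is needed. I wonder why Sun forces you to access a static variable non-statically.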

new URL("jar:file://dummy.jar!/").openConnection().setDefaultUseCaches(false);

JSoup apparently doesn't cope well with a response that is served from the cache, and ends up throwing the exception shown in the question.
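
For anyone who wants to confirm that the setting actually took effect, here is a minimal sketch using only java.net, under a few assumptions. The class and method names (DisableUrlCaching, disableDefaultCaching, newConnectionsUseCaches) are made up for illustration, and the throwaway connection is created from a placeholder http URL (example.invalid) instead of the jar: URL above, on the assumption that any URLConnection instance will do because the setter changes a shared default. Nothing is fetched; openConnection() only creates the connection object.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;

class DisableUrlCaching {

    // Hypothetical helper: turns off the JVM-wide default for URLConnection caching.
    // setDefaultUseCaches is an instance method, so a throwaway connection object
    // is needed; openConnection() creates it without contacting the host.
    static void disableDefaultCaching() throws IOException {
        // example.invalid is a placeholder host, not a real site.
        URLConnection dummy = new URL("http://example.invalid/").openConnection();
        dummy.setDefaultUseCaches(false);
    }

    // Hypothetical check: reports whether a newly created connection would use caches.
    static boolean newConnectionsUseCaches() throws IOException {
        URLConnection probe = new URL("http://example.invalid/").openConnection();
        return probe.getUseCaches();
    }

    public static void main(String[] args) throws IOException {
        disableDefaultCaching();
        System.out.println("new connections use caches: " + newConnectionsUseCaches());
    }
}
```

In the application from the question, the equivalent of disableDefaultCaching() would presumably run once at startup, before the first Jsoup.connect(url) call. On Java 9 and later there is also a static URLConnection.setDefaultUseCaches(String protocol, boolean defaultVal) overload, which might be a cleaner option if you only want to change one protocol.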

Skylion answered Nov 08 '22 07:11