
Java process hanging on IOUtils. Suspected deadlock

I have a Java process that hangs in a call to IOUtils.toString in the following code:

String html = "";
try {
    html = IOUtils.toString(someUrl.openStream(), "utf-8"); // process hangs on this line
} catch (Exception e) {
    return null;
}

I can't reproduce this reliably. The code is part of a web crawler, so this line executes successfully thousands of times, but after a few days the process eventually hangs here.

Output from jstack:

2013-09-25 09:09:36
Full thread dump OpenJDK 64-Bit Server VM (20.0-b12 mixed mode):

"Attach Listener" daemon prio=10 tid=0x00007f2b1c001000 nid=0x225a waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Thread-0" prio=10 tid=0x00007f2b34122000 nid=0x187b runnable [0x00007f2b30970000]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:146)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
        - locked <0x00000000e3d2d160> (a java.io.BufferedInputStream)
        at sun.net.www.http.ChunkedInputStream.readAheadBlocking(ChunkedInputStream.java:552)
        at sun.net.www.http.ChunkedInputStream.readAhead(ChunkedInputStream.java:609)
        at sun.net.www.http.ChunkedInputStream.read(ChunkedInputStream.java:696)
        - locked <0x00000000e3d30558> (a sun.net.www.http.ChunkedInputStream)
        at java.io.FilterInputStream.read(FilterInputStream.java:133)
        at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:2582)
        at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:282)
        at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:324)
        at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:176)
        - locked <0x00000000e3d317d0> (a java.io.InputStreamReader)
        at java.io.InputStreamReader.read(InputStreamReader.java:184)
        at java.io.Reader.read(Reader.java:140)
        at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1364)
        at org.apache.commons.io.IOUtils.copy(IOUtils.java:1340)
        at org.apache.commons.io.IOUtils.copy(IOUtils.java:1315)
        at org.apache.commons.io.IOUtils.toString(IOUtils.java:525)

I can't see any way to set a timeout on the toString method. Any suggestions? Is this a bug in Apache Commons IO, or perhaps in my OpenJDK?

matt burns asked Sep 25 '13 09:09


2 Answers

Your call to toString() is ultimately forwarded to copyLarge(). Here you can see that reading from the stream continues until InputStream.read() signals end of file (EOF). According to this post, read() can read 0 bytes, i.e., if the URLConnection you're reading from never signals EOF, the method probably keeps reading 0 bytes forever.
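The copy loop described above has roughly the following shape. This is a simplified sketch, not the Commons IO source: the class name CopyLoopSketch and the buffer size are assumptions made for illustration. What it shows is that the only way out of the loop is read() returning -1, so a server that stops sending data without closing the connection can keep the caller stuck indefinitely.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;

final class CopyLoopSketch {
    // Buffer size chosen for illustration only.
    private static final int BUFFER_SIZE = 4096;

    /** Copies characters until read() reports end of stream (-1). */
    static long copy(Reader input, Writer output) throws IOException {
        char[] buffer = new char[BUFFER_SIZE];
        long total = 0;
        int n;
        // No deadline here: the loop ends only when read() returns -1.
        while ((n = input.read(buffer)) != -1) {
            output.write(buffer, 0, n);
            total += n;
        }
        return total;
    }
}
```

Nothing in a loop of this kind can enforce a time limit, so any timeout has to come from the underlying connection or from the calling code.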

Maybe you can track down which URL causes the problem?

Anyway, to implement a timeout you could run each read in a separate thread and kill that thread after a certain time has elapsed.
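A minimal sketch of that idea, assuming Java 8 or later. The class TimedFetch and its methods fetch and callWithDeadline are hypothetical names, not part of Commons IO or the JDK. Because a thread blocked in a socket read may not react to interruption, the sketch also sets the connect and read timeouts that URLConnection itself offers. The worker-thread deadline then acts as an overall limit on top of the per-read limit.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class TimedFetch {

    /** Reads the whole response as UTF-8, with connection-level timeouts set. */
    static String fetch(URL url, int timeoutMillis) throws IOException {
        URLConnection connection = url.openConnection();
        connection.setConnectTimeout(timeoutMillis); // limit on establishing the connection
        connection.setReadTimeout(timeoutMillis);    // limit on waiting for data in a read
        StringBuilder text = new StringBuilder();
        try (InputStream in = connection.getInputStream();
             Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            char[] buffer = new char[4096];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                text.append(buffer, 0, n);
            }
        }
        return text.toString();
    }

    /** Runs the task on a worker thread; returns null on timeout or failure. */
    static String callWithDeadline(Callable<String> task, long deadlineMillis) {
        ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "fetch-worker");
            t.setDaemon(true); // a stuck worker must not keep the JVM alive
            return t;
        });
        Future<String> result = worker.submit(task);
        try {
            return result.get(deadlineMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            result.cancel(true); // interrupts the worker; a socket read may ignore this
            return null;
        } catch (ExecutionException e) {
            return null; // the task itself failed, e.g. with an IOException
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        } finally {
            worker.shutdownNow();
        }
    }
}
```

A call such as `TimedFetch.callWithDeadline(() -> TimedFetch.fetch(someUrl, 10_000), 30_000)` would then return null instead of hanging, mirroring the `return null` in the question's catch block. The timeout values are arbitrary examples.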

Sven Amann answered Oct 29 '22 17:10


I've decided to simply try Guava's I/O utilities instead, since Guava was already on my classpath anyway:

String html = "";
try {
    InputSupplier<? extends InputStream> supplier = Resources
            .newInputStreamSupplier(metaUrl);
    html = CharStreams.toString(CharStreams.newReaderSupplier(supplier,
            Charsets.UTF_8));
} catch (Exception e) {
    return null;
}

It generally takes a few days for the hang to appear, so if I don't update this answer in a few days, assume this worked!

Update: 7 days so far without hanging... :)

matt burns answered Oct 29 '22 16:10