Java file download hangs

I have a web interface that is used to download a file. When a request comes in, my GlassFish server streams the file from a web service and writes the content to the output stream. My code works fine except when the file is very large (more than 200 MB): the browser hangs at 0% downloaded and the file is never downloaded.

When I move the flush() call inside the while loop, it works fine for large files as well. I am not sure whether calling flush() in a loop is a problem, or how this actually works. My code is as follows:

HttpURLConnection conn = (HttpURLConnection) downloadUri.toURL().openConnection();
conn.setDoOutput(true);
conn.setRequestMethod("GET");
conn.setRequestProperty("Content-Type", "application/pdf");
if (conn.getResponseCode() == 200) {
    try (InputStream inputStream = conn.getInputStream()) {
        HttpServletResponse response = (HttpServletResponse) FacesContext
                .getCurrentInstance().getExternalContext().getResponse();
        response.setContentType("application/octet-stream");
        response.setHeader("Content-Length", conn.getHeaderField("Content-Length"));
        response.setHeader("Content-Disposition", "attachment; filename=\"" + abbr + ".pdf\"");
        ServletOutputStream output = response.getOutputStream();
        byte[] buffer = new byte[1024];
        int bytesRead;
        while ((bytesRead = inputStream.read(buffer)) != -1) {
            output.write(buffer, 0, bytesRead);
        }
        output.flush();
        output.close();
    }
}

Any thoughts? Thank you for looking into this.

asked Feb 12 '23 by vinay

1 Answer

The flush() method instructs the stream to actually send any buffered output down the pipe.

For various performance reasons, stream implementations may cache output rather than write it to the underlying stream right away, for example to save disk I/O operations, which are expensive.

There is nothing wrong with flushing a stream, apart from the possible performance cost, and in this case flushing is exactly what you want: the download appears stuck until you flush, so you want the stream to actually send data to the client.

You could also experiment with the buffer size (something larger than 1024 bytes) to see what works best.

EDIT:

Whether you flush inside the loop or not is largely irrelevant.

You can call flush() whenever you want; as said, it passes the data down to the underlying OS stream, and whether that costs performance depends on the situation.

For example, you could decide that the 200 MB of RAM in which the stream is buffering the file matters more, even performance-wise, than the extra I/O operations.

Or, much more simply, you could value the user experience of seeing the file actually downloading over the eventual performance hit, which you might not even be able to measure.

As said, the larger your buffer, the less the loop matters. As an extreme example, suppose your buffer is 100 MB: then an 80 MB file gets only one flush, which it would get anyway at the end of the request.

A 1 KB buffer is probably too small; 4 KB is better and 16 KB is fine. It is a tradeoff between I/O calls and RAM consumption.

The stream should do its job properly by itself. If, however, you are seeing that a 200 MB file gets fully cached unless you call flush(), then the stream is probably optimizing for performance at the cost of a bad user experience, and you do need the flush() inside the loop.
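A minimal, self-contained sketch of the pattern described above, with flush() inside the copy loop and a 16 KB buffer. In-memory streams (ByteArrayInputStream/ByteArrayOutputStream) stand in for the servlet and HTTP classes here, so this is an illustration of the copy loop only, not the asker's exact servlet code:

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class CopyWithFlush {

    // Copies input to output in 16 KB chunks, flushing after each chunk
    // so buffered data reaches the client while the download progresses.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[16 * 1024];
        long total = 0;
        int bytesRead;
        while ((bytesRead = in.read(buffer)) != -1) {
            out.write(buffer, 0, bytesRead);
            out.flush(); // push this chunk down the pipe immediately
            total += bytesRead;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[100_000]; // stand-in for the downloaded file
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // Wrap the sink in a BufferedOutputStream to mimic a buffering
        // response stream; without the flush() calls, data could sit in
        // its internal buffer instead of reaching the sink.
        long copied = copy(new ByteArrayInputStream(data),
                           new BufferedOutputStream(sink));
        System.out.println("copied " + copied + " bytes; sink has " + sink.size());
    }
}
```

The flush() inside the loop is what trades a little I/O overhead for the progressive-download behavior the question is after.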

answered Feb 15 '23 by Simone Gianni