Java: Reading a pdf file from URL into Byte array/ByteBuffer in an applet

I'm trying to figure out why this particular snippet of code isn't working for me. I've got an applet which is supposed to read a .pdf and display it with a pdf-renderer library, but for some reason, when I read in the .pdf files which sit on my server, they end up corrupt. I've tested this by writing the files back out again.

I've tried viewing the applet in both IE and Firefox, and the files come out corrupt in both. Funny thing is, when I try viewing the applet in Safari (for Windows), the file is actually fine! I understand the JVM might be different, but I am still lost. I've compiled with Java 1.5; the JVMs are 1.6. The snippet which reads the file is below.

public static ByteBuffer getAsByteArray(URL url) throws IOException {
        ByteArrayOutputStream tmpOut = new ByteArrayOutputStream();

        URLConnection connection = url.openConnection();
        int contentLength = connection.getContentLength();
        InputStream in = url.openStream();
        byte[] buf = new byte[512];
        int len;
        while (true) {
            len = in.read(buf);
            if (len == -1) {
                break;
            }
            tmpOut.write(buf, 0, len);
        }
        tmpOut.close();
        ByteBuffer bb = ByteBuffer.wrap(tmpOut.toByteArray(), 0,
                                        tmpOut.size());
        //Lines below used to test if file is corrupt
        //FileOutputStream fos = new FileOutputStream("C:\\abc.pdf");
        //fos.write(tmpOut.toByteArray());
        return bb;
}

I must be missing something, and I've been banging my head trying to figure it out. Any help is greatly appreciated. Thanks.

Edit: To further clarify my situation, the difference between the files before I read them with the snippet and after is that the ones I output after reading are significantly smaller than the originals. When opened, they are not recognized as .pdf files. No exceptions are being thrown and ignored, and I have tried flushing to no avail.

This snippet works in Safari, meaning the files are read in their entirety, with no difference in size, and can be opened with any .pdf reader. In IE and Firefox, the files always end up corrupted, consistently at the same smaller size.

I monitored the len variable (when reading a 59 KB file), hoping to see how many bytes get read in on each pass through the loop. In IE and Firefox, at 18 KB, in.read(buf) returns -1 as if the file has ended. Safari does not do this.
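The check described here can also be done in code instead of by watching len by hand: read to end of stream, then compare the total against the Content-Length value. The following is only a minimal sketch, assuming the server reports a Content-Length; the class and method names (TruncationCheck, readFully) are hypothetical and not part of the original snippet.

```java
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, for illustration only.
final class TruncationCheck {

    /**
     * Reads the stream until read() returns -1 and returns all bytes.
     * expectedLength is the value from URLConnection.getContentLength(),
     * or -1 if the length is unknown (in which case no check is made).
     */
    static byte[] readFully(InputStream in, int expectedLength) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[512];
        int len;
        while ((len = in.read(buf)) != -1) {
            out.write(buf, 0, len);
        }
        // Fail loudly if the stream ended before the announced length.
        if (expectedLength != -1 && out.size() != expectedLength) {
            throw new EOFException("Read " + out.size()
                    + " bytes, expected " + expectedLength);
        }
        return out.toByteArray();
    }
}
```

With this in place, a download that stops at 18 KB of a 59 KB file would raise an EOFException rather than silently returning a short array.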

I'll keep at it, and I appreciate all the suggestions so far.

637

asked Mar 12 '09 01:03

Pol

1 Answer

Just in case these small changes make a difference, try this:

public static ByteBuffer getAsByteArray(URL url) throws IOException {
    URLConnection connection = url.openConnection();
    // Since you get a URLConnection, use it to get the InputStream
    InputStream in = connection.getInputStream();
    // Now that the InputStream is open, get the content length
    int contentLength = connection.getContentLength();

    // To avoid having to resize the array over and over and over as
    // bytes are written to the array, provide an accurate estimate of
    // the ultimate size of the byte array
    ByteArrayOutputStream tmpOut;
    if (contentLength != -1) {
        tmpOut = new ByteArrayOutputStream(contentLength);
    } else {
        tmpOut = new ByteArrayOutputStream(16384); // Pick some appropriate size
    }

    byte[] buf = new byte[512];
    while (true) {
        int len = in.read(buf);
        if (len == -1) {
            break;
        }
        tmpOut.write(buf, 0, len);
    }
    in.close();
    tmpOut.close(); // No effect, but good to do anyway to keep the metaphor alive

    byte[] array = tmpOut.toByteArray();

    //Lines below used to test if file is corrupt
    //FileOutputStream fos = new FileOutputStream("C:\\abc.pdf");
    //fos.write(array);
    //fos.close();

    return ByteBuffer.wrap(array);
}

You forgot to close fos, which may result in that file being shorter if your application is still running or is abruptly terminated. Also, I now create the ByteArrayOutputStream with an appropriate initial size. (Otherwise Java has to repeatedly allocate a new array and copy the old contents into it, which is expensive.) Replace the value 16384 with a more appropriate value. 16k is probably small for a PDF, but I don't know what "average" size you expect to download.
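The point about closing fos can be sketched as below. This is only an illustration, written in the pre-Java 7 try/finally style to match the Java 1.5 target mentioned in the question; the class and method names (DebugDump, writeBytes) are hypothetical.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical diagnostic helper, for illustration only.
final class DebugDump {

    /** Writes data to target and always closes the stream, even on failure. */
    static void writeBytes(File target, byte[] data) throws IOException {
        FileOutputStream fos = new FileOutputStream(target);
        try {
            fos.write(data);
        } finally {
            // Closing releases the file handle so the file can be
            // inspected while the applet is still running.
            fos.close();
        }
    }
}
```

In the diagnostic code this would be called as DebugDump.writeBytes(new File("C:\\abc.pdf"), array).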

Since you use toByteArray() twice (even though one is in diagnostic code), I assigned that to a variable. Finally, although it shouldn't make any difference, when you are wrapping the entire array in a ByteBuffer, you only need to supply the byte array itself. Supplying the offset 0 and the length is redundant.
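To illustrate why the extra arguments are redundant, here is a small sketch comparing the two wrap calls on the same array. The class and method names (WrapDemo, sameView) are made up for this example.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class, for illustration only.
final class WrapDemo {

    /** Returns true if both wrap variants describe the same view of the array. */
    static boolean sameView(byte[] array) {
        ByteBuffer whole = ByteBuffer.wrap(array);
        ByteBuffer explicit = ByteBuffer.wrap(array, 0, array.length);
        // Both buffers share the backing array and start at position 0
        // with limit and capacity equal to array.length.
        return whole.position() == explicit.position()
                && whole.limit() == explicit.limit()
                && whole.capacity() == explicit.capacity()
                && whole.equals(explicit);
    }
}
```

The three-argument form only matters when wrapping a sub-range of the array, which is not the case here.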

Note that if you are downloading large PDF files this way, then ensure that your JVM is running with a large enough heap that you have enough room for several times the largest file size you expect to read. The method you're using keeps the whole file in memory, which is OK as long as you can afford that memory. :)
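As a rough illustration of the memory point, the sketch below estimates whether the heap currently has room for a given number of copies of a file. It is an approximation under stated assumptions, not a guarantee: the names (HeapCheck, hasRoomFor) and the idea of counting "copies" are hypothetical, and the figures reported by Runtime change as the garbage collector runs. Several copies are plausible here because ByteArrayOutputStream holds its own internal buffer and toByteArray() returns a separate copy of it.

```java
// Hypothetical helper, for illustration only.
final class HeapCheck {

    /**
     * Returns true if the heap could, by a rough estimate, still hold
     * `copies` arrays of `fileSize` bytes each. copies must be positive.
     */
    static boolean hasRoomFor(long fileSize, int copies) {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        long available = rt.maxMemory() - used;
        // Divide rather than multiply to avoid overflow on large sizes.
        return fileSize <= available / copies;
    }
}
```

For example, HeapCheck.hasRoomFor(contentLength, 3) could be checked before starting the download, where 3 is an assumed safety factor.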

141

answered Oct 23 '22 22:10

Eddie

Tags: java, pdf, applet
