ZLib decompression fails on large byte array

Tags: java, arrays, zlib

While experimenting with ZLib compression, I ran into a strange problem: decompressing a zlib-compressed byte array of random data fails reproducibly if the source array is at least 32752 bytes long. Here is a small program that reproduces the problem; you can see it in action on IDEOne. The compression and decompression methods are standard code taken from tutorials.

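import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.Random;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class ZlibMain {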

    private static byte[] compress(final byte[] data) {
        final Deflater deflater = new Deflater();
        deflater.setInput(data);

        deflater.finish();
        final byte[] bytesCompressed = new byte[Short.MAX_VALUE];
        final int numberOfBytesAfterCompression = deflater.deflate(bytesCompressed);
        final byte[] returnValues = new byte[numberOfBytesAfterCompression];
        System.arraycopy(bytesCompressed, 0, returnValues, 0, numberOfBytesAfterCompression);
        return returnValues;

    }

    private static byte[] decompress(final byte[] data) {
        final Inflater inflater = new Inflater();
        inflater.setInput(data);
        try (ByteArrayOutputStream outputStream = new ByteArrayOutputStream(data.length)) {
            final byte[] buffer = new byte[Math.max(1024, data.length / 10)];
            while (!inflater.finished()) {
                final int count = inflater.inflate(buffer);
                outputStream.write(buffer, 0, count);
            }
            outputStream.close();
            final byte[] output = outputStream.toByteArray();
            return output;
        } catch (DataFormatException | IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(final String[] args) {
        roundTrip(100);
        roundTrip(1000);
        roundTrip(10000);
        roundTrip(20000);
        roundTrip(30000);
        roundTrip(32000);
        for (int i = 32700; i < 33000; i++) {
            if(!roundTrip(i))break;
        }
    }

    private static boolean roundTrip(final int i) {
        System.out.printf("Starting round trip with size %d: ", i);
        final byte[] data = new byte[i];
        for (int j = 0; j < data.length; j++) {
            data[j]= (byte) j;
        }
        shuffleArray(data);

        final byte[] compressed = compress(data);
        try {
            final byte[] decompressed = CompletableFuture.supplyAsync(() -> decompress(compressed))
                                                         .get(2, TimeUnit.SECONDS);
            System.out.printf("Success (%s)%n", Arrays.equals(data, decompressed) ? "matching" : "non-matching");
            return true;
        } catch (InterruptedException | ExecutionException | TimeoutException e) {
            System.out.println("Failure!");
            return false;
        }
    }

    // Implementing Fisher–Yates shuffle
    // source: https://stackoverflow.com/a/1520212/342852
    static void shuffleArray(byte[] ar) {
        Random rnd = ThreadLocalRandom.current();
        for (int i = ar.length - 1; i > 0; i--) {
            int index = rnd.nextInt(i + 1);
            // Simple swap
            byte a = ar[index];
            ar[index] = ar[i];
            ar[i] = a;
        }
    }
}

Is this a known bug in ZLib? Or do I have an error in my compress / decompress routines?

909

asked May 31 '17 11:05

Sean Patrick Floyd

2 Answers

The error is in the logic of the compress / decompress methods. I am not deeply familiar with the implementation, but with debugging I found the following:

When the 32752-byte array is compressed, deflater.deflate() returns 32767, which is exactly the size to which you initialized the output buffer in this line:

final byte[] bytesCompressed = new byte[Short.MAX_VALUE];

If you increase the buffer size, for example to

final byte[] bytesCompressed = new byte[4 * Short.MAX_VALUE];

then you will see that the 32752 bytes of input are actually deflated to 32768 bytes. So in your code, the compressed data is truncated: it does not contain all the data that should be in there.
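The truncation can be detected right after compressing, because Deflater.finished() stays false until the end of the compressed stream has been written to an output buffer. The following is a minimal sketch of that check; the class name TruncationCheck, the helper fitsInOneCall, and the fixed random seed are made up for illustration and are not part of the question's code.

```java
import java.util.Random;
import java.util.zip.Deflater;

public class TruncationCheck {

    /**
     * Returns true if a single deflate() call into a buffer of the given
     * size was enough to hold the complete compressed stream.
     */
    static boolean fitsInOneCall(byte[] data, int bufferSize) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(data);
            deflater.finish();
            byte[] buffer = new byte[bufferSize];
            deflater.deflate(buffer);
            // finished() only becomes true once the end of the compressed
            // stream has been written; false means output is still pending.
            return deflater.finished();
        } finally {
            deflater.end(); // release native zlib resources
        }
    }

    public static void main(String[] args) {
        byte[] data = new byte[32752];
        new Random(42).nextBytes(data); // hypothetical seed, illustration only
        System.out.println("Short.MAX_VALUE buffer:     "
                + fitsInOneCall(data, Short.MAX_VALUE));
        System.out.println("4 * Short.MAX_VALUE buffer: "
                + fitsInOneCall(data, 4 * Short.MAX_VALUE));
    }
}
```

If the helper returns false, the bytes copied out of the buffer are only a prefix of the compressed stream and cannot be inflated completely.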

When you then try to decompress, inflater.inflate() returns zero, which indicates that more input data is needed. But since you only check inflater.finished(), you end up in an endless loop.
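One way to turn that endless loop into a visible error is to check needsInput() and needsDictionary() whenever inflate() returns zero. This is only a sketch under the assumption that the whole compressed input is handed to the Inflater up front; the class name SafeInflate and the choice of IllegalArgumentException are illustrative.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;

public class SafeInflate {

    /** Inflates zlib data and fails fast instead of spinning on truncated input. */
    static byte[] decompress(byte[] data) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(data);
            ByteArrayOutputStream out = new ByteArrayOutputStream(data.length);
            byte[] buffer = new byte[1024];
            while (!inflater.finished()) {
                int count = inflater.inflate(buffer);
                out.write(buffer, 0, count);
                // No progress, stream not finished, and nothing left to read:
                // the input was cut short (or needs a preset dictionary).
                if (count == 0 && !inflater.finished()
                        && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException(
                            "Compressed data is truncated or requires a dictionary");
                }
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("Not valid zlib data", e);
        } finally {
            inflater.end(); // release native zlib resources
        }
    }
}
```

With a guard like this, the truncated array from the question would produce an exception immediately rather than a timeout.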

So you can increase the buffer size when compressing, but that probably just moves the problem to bigger inputs. The better option is to rewrite the compress / decompress logic to process your data in chunks.
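As one possible way to get chunked processing without writing the loops by hand, the stream wrappers in java.util.zip can drive the Deflater and Inflater internally. The sketch below is an alternative technique, not the code from the question; the class name ZlibStreams and the 4096-byte buffer size are arbitrary choices.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

public class ZlibStreams {

    /** Compresses data of any size; the stream drains the deflater in chunks. */
    static byte[] compress(byte[] data) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(sink)) {
            out.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Closing the stream above finishes the zlib stream before we read it.
        return sink.toByteArray();
    }

    /** Decompresses zlib data by reading from the inflating stream until EOF. */
    static byte[] decompress(byte[] compressed) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (InputStream in =
                     new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                sink.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }
}
```

Because the output grows as needed, the result no longer depends on guessing an upper bound for the compressed size.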

173

answered Nov 07 '22 12:11

P.J.Meisch

Apparently the compress() method was faulty. This one works:

public static byte[] compress(final byte[] data) {
    final Deflater deflater = new Deflater();
    try (ByteArrayOutputStream outputStream = new ByteArrayOutputStream(data.length)) {
        deflater.setInput(data);
        deflater.finish();

        // Drain the deflater in fixed-size chunks until it reports that
        // the complete compressed stream has been produced.
        final byte[] buffer = new byte[1024];
        while (!deflater.finished()) {
            final int count = deflater.deflate(buffer);
            outputStream.write(buffer, 0, count);
        }

        return outputStream.toByteArray();
    } catch (IOException e) {
        throw new IllegalStateException(e);
    } finally {
        // Release the native zlib resources held by the Deflater.
        deflater.end();
    }
}

answered Nov 07 '22 13:11

Sean Patrick Floyd
