
GZIPOutputStream: Increase compression level

java.util.zip.GZIPOutputStream does not provide a constructor argument or a setter for the compression level of its underlying Deflater.

There are ways to work around this issue, as described here, for example:

GZIPOutputStream gzip = new GZIPOutputStream(output) {
    {
        this.def.setLevel(Deflater.BEST_COMPRESSION);
    }
};

I GZIPped a 10G file with this and its size didn't decrease by a single bit compared to using the preset DEFAULT_COMPRESSION.

The answer to this question says that under certain circumstances setting the level might not work as planned. Just to make sure, I also tried to create a new Deflater:

this.def = new Deflater(Deflater.BEST_COMPRESSION, true);

But still no reduction in file size...

Is there a reason why they did not provide access to the Deflater level?

Or is something wrong with the code sample above?

Does the deflater level work at all?

Edit: Thanks for the comments.

  1. Can the file be compressed any further?

    It's a UTF-8 text file that is compressed from 10G to 10M using Default compression. So without knowing details about the compression levels, I reckoned it could be compressed further.

  2. Time difference between DEFAULT_COMPRESSION and BEST_COMPRESSION?

    I don't have time to create really reliable figures. But I executed the code with each compression level about five times and both take about the same time (2 minutes +/- 5 seconds).

  3. File size with gzip -v9?

    The file created by gzip is about 15KB smaller than the one created by Java. So, for my specific use case it's not worth investigating this topic any further.

However, the three fundamental questions stated above still stand. Has anyone ever successfully reduced a file's size by using a higher compression level with GZIPOutputStream?

schnatterer asked Oct 02 '13 13:10


1 Answer

Yes, I increased my compression ratio slightly using Java's GZIPOutputStream.

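import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.GZIPOutputStream;

class MyGZIPOutputStream 
    extends GZIPOutputStream {

    public MyGZIPOutputStream( OutputStream out ) throws IOException {
        super( out );
    } 

    // def is the protected Deflater field inherited from DeflaterOutputStream
    public void setLevel( int level ) {
        def.setLevel(level);
    }
}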

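Just wrap it around your stream and set the level, keeping a reference to the wrapper so you can write to it afterwards (setLevel returns void, so chaining it onto the constructor would throw the stream away):

MyGZIPOutputStream gzip = new MyGZIPOutputStream( outputstream );
gzip.setLevel( Deflater.BEST_COMPRESSION );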
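
For a complete picture, here is a minimal round-trip sketch of the same idea. It is not the exact code I ran on my data: the class names GzipLevelRoundTrip and LevelGZIPOutputStream are made up for illustration, the level is passed to the constructor instead of a setter, and everything happens in memory rather than on files.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.Deflater;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipLevelRoundTrip {

    // Hypothetical variant of MyGZIPOutputStream that takes the level up front.
    // 'def' is the protected Deflater field inherited from DeflaterOutputStream.
    static class LevelGZIPOutputStream extends GZIPOutputStream {
        LevelGZIPOutputStream(OutputStream out, int level) throws IOException {
            super(out);
            def.setLevel(level);
        }
    }

    // Compress a byte array in memory at the given Deflater level.
    static byte[] compress(byte[] input, int level) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // Closing the stream finishes the deflate data and writes the gzip trailer.
        try (GZIPOutputStream gzip = new LevelGZIPOutputStream(sink, level)) {
            gzip.write(input);
        }
        return sink.toByteArray();
    }

    // Decompress with the stock GZIPInputStream to confirm the output is valid gzip.
    static byte[] decompress(byte[] gzipped) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gzipped))) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "some sample text, some sample text, some sample text"
                .getBytes(StandardCharsets.UTF_8);
        byte[] gzipped = compress(original, Deflater.BEST_COMPRESSION);
        byte[] restored = decompress(gzipped);
        System.out.println("round trip ok: " + Arrays.equals(original, restored));
    }
}
```

The try-with-resources block matters here: the gzip trailer is only written when the stream is closed or finished, so reading the byte array before that point would give a truncated file.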

Here are the results I measured on 3.2 GB of data:

Compression ratio before (default compression): 1.3823362619139712

Compression ratio after (best compression): 1.3836412922501984

I know it's not a great improvement, but it's still progress.
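
If you want to check whether the level has any effect on your own data, a small comparison like the following may help. Treat it as a sketch under assumptions: GzipLevelComparison and its sample text are invented for illustration, the ratio is assumed to mean original size divided by compressed size, and it uses the anonymous-subclass idiom from the question instead of my MyGZIPOutputStream class.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;
import java.util.zip.GZIPOutputStream;

class GzipLevelComparison {

    // Gzip 'input' in memory at the given level and return the compressed size in bytes.
    static int gzippedSize(byte[] input, final int level) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // Anonymous subclass with an instance initializer, as in the question.
        try (GZIPOutputStream gzip = new GZIPOutputStream(sink) {
            {
                def.setLevel(level);
            }
        }) {
            gzip.write(input);
        }
        return sink.size();
    }

    // Assumed definition of "compression ratio": original size / compressed size.
    static double ratio(byte[] input, int level) throws IOException {
        return (double) input.length / gzippedSize(input, level);
    }

    // Made-up, highly repetitive sample text; real data will behave differently.
    static byte[] sampleData() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 20000; i++) {
            sb.append("line ").append(i % 97)
              .append(": the quick brown fox jumps over the lazy dog\n");
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = sampleData();
        int[] levels = {Deflater.NO_COMPRESSION, Deflater.BEST_SPEED,
                Deflater.DEFAULT_COMPRESSION, Deflater.BEST_COMPRESSION};
        for (int level : levels) {
            System.out.printf("level %2d: %d bytes, ratio %.3f%n",
                    level, gzippedSize(data, level), ratio(data, level));
        }
    }
}
```

Including NO_COMPRESSION in the list is a quick sanity check: if that size differs clearly from the others, the level is being applied, and any small gap between the default and best levels is down to the data rather than the API.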

Krishna Sundar answered Sep 21 '22 08:09