
GZIP compression to a byte array

I am trying to write a class that can compress data. The code below fails: no exception is thrown, but the target .gz file is empty.
Also, I don't want to generate the .gz file directly, as is done in all the examples. I only want to get the compressed data, so that I can, for example, encrypt it before writing it to a file.

If I write directly to a file everything works fine:

import java.io.*;
import java.util.zip.*;
import java.nio.charset.*;

public class Zipper
{
  public static void main(String[] args)
  {    
    byte[] dataToCompress = "This is the test data."
      .getBytes(StandardCharsets.ISO_8859_1);

    GZIPOutputStream zipStream = null;
    FileOutputStream fileStream = null;
    try
    {
      fileStream = new FileOutputStream("C:/Users/UserName/Desktop/zip_file.gz");
      zipStream = new GZIPOutputStream(fileStream);
      // The GZIP stream writes the compressed bytes to the file itself.
      zipStream.write(dataToCompress);
    }
    catch(Exception e)
    {
      e.printStackTrace();
    }
    finally
    {
      try{ zipStream.close(); }
        catch(Exception e){ }
      try{ fileStream.close(); }
        catch(Exception e){ }
    }
  }
}

But if I route the output through a byte array stream instead, it does not produce a single byte: compressedData is always empty.

import java.io.*;
import java.util.zip.*;
import java.nio.charset.*;

public class Zipper
{
  public static void main(String[] args)
  {    
    byte[] dataToCompress = "This is the test data."
      .getBytes(StandardCharsets.ISO_8859_1);
    byte[] compressedData = null;

    GZIPOutputStream zipStream = null;
    ByteArrayOutputStream byteStream = null;
    FileOutputStream fileStream = null;
    try
    {
      byteStream = new ByteArrayOutputStream(dataToCompress.length);
      zipStream = new GZIPOutputStream(byteStream);
      zipStream.write(dataToCompress);

      compressedData = byteStream.toByteArray();

      fileStream = new FileOutputStream("C:/Users/UserName/Desktop/zip_file.gz");
      fileStream.write(compressedData);
    }
    catch(Exception e)
    {
      e.printStackTrace();
    }
    finally
    {
      try{ zipStream.close(); }
        catch(Exception e){ }
      try{ byteStream.close(); }
        catch(Exception e){ }
      try{ fileStream.close(); }
        catch(Exception e){ }
    }
  }
}

Master-Jimmy asked Feb 08 '13 17:02


1 Answer

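The problem is that you are not closing the GZIPOutputStream before you read the byte array. Until the stream is closed (or finish() is called on it), the output is incomplete: compressed data may still be buffered, and the GZIP trailer has not been written.

You just need to close it before calling toByteArray(). To achieve this, you need to reorder the finally blocks so that zipStream is closed first: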

import java.io.*;
import java.util.zip.*;
import java.nio.charset.*;

public class Zipper
{
  public static void main(String[] args)
  {    
    byte[] dataToCompress = "This is the test data."
      .getBytes(StandardCharsets.ISO_8859_1);

    try
    {
      ByteArrayOutputStream byteStream =
        new ByteArrayOutputStream(dataToCompress.length);
      try
      {
        GZIPOutputStream zipStream =
          new GZIPOutputStream(byteStream);
        try
        {
          zipStream.write(dataToCompress);
        }
        finally
        {
          zipStream.close();
        }
      }
      finally
      {
        byteStream.close();
      }

      byte[] compressedData = byteStream.toByteArray();

      FileOutputStream fileStream =
        new FileOutputStream("C:/Users/UserName/Desktop/zip_file.gz");
      try
      {
        fileStream.write(compressedData);
      }
      finally
      {
        try{ fileStream.close(); }
          catch(Exception e){ /* We should probably delete the file now? */ }
      }
    }
    catch(Exception e)
    {
      e.printStackTrace();
    }
  }
}
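
To see why the close matters, the following sketch (not part of the original answer) measures how many bytes have reached the ByteArrayOutputStream before and after the GZIP stream is finished. The class name GzipCloseDemo and the method sizesAroundFinish are made up for illustration. finish() is used here so that the size can be read while the stream object is still open; close() has the same effect on the output.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class, not part of the original answer.
class GzipCloseDemo
{
  // Returns { bytes available before finish(), bytes available after finish() }.
  static int[] sizesAroundFinish(byte[] data) throws IOException
  {
    ByteArrayOutputStream byteStream = new ByteArrayOutputStream();
    GZIPOutputStream zipStream = new GZIPOutputStream(byteStream);
    try
    {
      zipStream.write(data);
      // Compressed data may still be held back by the deflater at this point.
      int before = byteStream.size();

      // finish() writes the remaining compressed data and the GZIP trailer.
      zipStream.finish();
      int after = byteStream.size();

      return new int[] { before, after };
    }
    finally
    {
      zipStream.close();
    }
  }

  public static void main(String[] args) throws IOException
  {
    byte[] data = "This is the test data."
      .getBytes(StandardCharsets.ISO_8859_1);
    int[] sizes = sizesAroundFinish(data);
    System.out.println("before finish(): " + sizes[0] + " bytes");
    System.out.println("after finish():  " + sizes[1] + " bytes");
  }
}
```

For a short input like the test string, the first number typically covers no more than the GZIP header, which is why a byte array captured at that point is not a usable .gz file.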

I do not recommend initializing the stream variables to null, because it means your finally block can also throw a NullPointerException (if a constructor fails, the variable is still null when close() is called on it).

Also note that you can declare main to throw IOException; then you would not need the outermost try statement.
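
The two points above can be combined with try-with-resources (available since Java 7), which declares and closes the stream in one place, so no variable is ever null. This is a minimal sketch rather than the code from the answer: the class name ZipperTwr, the helper compress, and the default output path are assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.GZIPOutputStream;

// Hypothetical variant of the Zipper class using try-with-resources.
class ZipperTwr
{
  // Hypothetical helper: GZIP-compresses data entirely in memory.
  static byte[] compress(byte[] data) throws IOException
  {
    ByteArrayOutputStream byteStream =
      new ByteArrayOutputStream(data.length);

    // zipStream is closed at the end of this block, so the trailer
    // has been written before toByteArray() is called.
    try (GZIPOutputStream zipStream = new GZIPOutputStream(byteStream))
    {
      zipStream.write(data);
    }
    return byteStream.toByteArray();
  }

  // main declares IOException, so no outer try/catch is needed.
  public static void main(String[] args) throws IOException
  {
    byte[] dataToCompress = "This is the test data."
      .getBytes(StandardCharsets.ISO_8859_1);
    byte[] compressedData = compress(dataToCompress);

    // Assumed output location: first argument, else zip_file.gz
    // in the working directory.
    Path target = Paths.get(args.length > 0 ? args[0] : "zip_file.gz");
    Files.write(target, compressedData);
  }
}
```

Because compress returns a plain byte array, the caller is free to process it further (for example, encrypt it) before anything is written to disk.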

There is little point in swallowing exceptions from zipStream.close(), because if it throws an exception you will not have a valid .gz file, so you should not proceed to write it.
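
One way to check that a byte array really is a valid GZIP stream is to decompress it again in memory before writing it anywhere. The sketch below assumes hypothetical names (GzipRoundTrip, compress, decompress) and uses only java.util.zip classes; it is an illustration, not part of the original answer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class for an in-memory round trip.
class GzipRoundTrip
{
  static byte[] compress(byte[] data) throws IOException
  {
    ByteArrayOutputStream byteStream = new ByteArrayOutputStream();
    try (GZIPOutputStream zipStream = new GZIPOutputStream(byteStream))
    {
      zipStream.write(data);
    }
    return byteStream.toByteArray();
  }

  // Throws an IOException if the input is not a complete GZIP stream,
  // for example when the compressing stream was never closed.
  static byte[] decompress(byte[] compressed) throws IOException
  {
    ByteArrayOutputStream result = new ByteArrayOutputStream();
    try (GZIPInputStream unzipStream =
           new GZIPInputStream(new ByteArrayInputStream(compressed)))
    {
      byte[] buffer = new byte[4096];
      int count;
      while ((count = unzipStream.read(buffer)) != -1)
      {
        result.write(buffer, 0, count);
      }
    }
    return result.toByteArray();
  }

  public static void main(String[] args) throws IOException
  {
    byte[] original = "This is the test data."
      .getBytes(StandardCharsets.ISO_8859_1);
    byte[] restored = decompress(compress(original));
    System.out.println(new String(restored, StandardCharsets.ISO_8859_1));
  }
}
```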

I would not swallow exceptions from byteStream.close() either, but for a different reason: they should never be thrown, so if one is, there is a bug in your JRE and you would want to know about it.

finnw answered Sep 19 '22 09:09