
Is this a bug in the Java GZIPInputStream class?

I noticed that some of my gzip decoding code was failing to detect corrupted data, and I think I have traced the problem to the Java GZIPInputStream class. In particular, when you read the entire stream with a single 'read' call, corrupted data doesn't trigger an IOException. If you read the same corrupted data in two or more calls, it does trigger an exception.

I wanted to see what the community here thought before I consider filing a bug report.

EDIT: I have modified my example because the previous one did not illustrate the issue as clearly. In this new example, a 10-byte buffer is gzipped, one byte of the gzipped buffer is modified, and the result is ungzipped. The call to 'GZIPInputStream.read' returns 10 as the number of bytes read, which is what you would expect for a 10-byte buffer. Nevertheless, the unzipped buffer differs from the original (due to the corruption), and no exception is thrown. I did note that calling 'available' after the read returns '1' instead of the '0' it would return if EOF had been reached.

Here is the source:

  @Test public void gzip() {
    try {
      int length = 10;
      byte[] bytes = new byte[]{12, 19, 111, 14, -76, 34, 60, -43, -91, 101};
      System.out.println(Arrays.toString(bytes));

      //Gzip the byte array
      ByteArrayOutputStream baos = new ByteArrayOutputStream();
      GZIPOutputStream gos = new GZIPOutputStream(baos);
      gos.write(bytes);
      gos.finish();
      byte[] zipped = baos.toByteArray();

      //Alter one byte of the gzipped array.  
      //This should be detected by gzip crc-32 checksum
      zipped[15] = (byte)(0);

      //Unzip the modified array
      ByteArrayInputStream bais = new ByteArrayInputStream(zipped);
      GZIPInputStream gis = new GZIPInputStream(bais);
      byte[] unzipped = new byte[length];
      int numRead = gis.read(unzipped);
      System.out.println("NumRead: " + numRead);
      System.out.println("Available: " + gis.available());

      //The unzipped array is now [12, 19, 111, 14, -80, 0, 0, 0, 10, -118].
      //No IOException was thrown.
      System.out.println(Arrays.toString(unzipped));

      //Assert that the input and unzipped arrays are equal (they aren't)
      org.junit.Assert.assertArrayEquals(unzipped, bytes);
    } catch (IOException e) {
      e.printStackTrace();
    }
  }

Asked by Jacob on Mar 11 '11 at 18:03




1 Answer

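I decided to run the test.

Here is what you missed: gis.read(unzipped) returns 1, so it has read only a single byte. That is nothing to complain about, since a single read() call is allowed to return fewer bytes than requested and the end of the stream has not been reached yet.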

The next read() throws "Corrupt GZIP trailer".

So it's all good! (and there are no bugs, at least in GZIPInputStream)
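
To see the same effect without depending on which data byte happens to be damaged, here is a minimal sketch that corrupts the CRC-32 in the gzip trailer and then keeps reading until read() returns -1. The class name GzipTrailerCheck and the helpers gzip and failsWhenDrained are made up for this illustration; only the java.io and java.util.zip calls are real. It assumes a single-member gzip stream, where the trailer is the last 8 bytes (CRC-32 followed by the uncompressed length).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipTrailerCheck {

    // Hypothetical helper: compress a byte array into a complete gzip stream.
    public static byte[] gzip(byte[] data) {
        try {
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            try (GZIPOutputStream gos = new GZIPOutputStream(baos)) {
                gos.write(data);
            }
            return baos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical helper: read until read() returns -1 and report whether
    // an IOException was raised anywhere along the way.
    public static boolean failsWhenDrained(byte[] zipped) {
        try (GZIPInputStream gis =
                 new GZIPInputStream(new ByteArrayInputStream(zipped))) {
            byte[] buf = new byte[64];
            while (gis.read(buf) != -1) {
                // Keep reading; the trailer can only be verified at end of stream.
            }
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        byte[] original = {12, 19, 111, 14, -76, 34, 60, -43, -91, 101};
        byte[] zipped = gzip(original);
        System.out.println("intact stream fails:  " + failsWhenDrained(zipped));

        // Flip one bit in the first CRC-32 byte (assumed to sit 8 bytes from the end).
        zipped[zipped.length - 8] ^= 0x01;
        System.out.println("corrupt stream fails: " + failsWhenDrained(zipped));
    }
}
```

The point of the loop is that the checksum covers the whole payload, so it can only be compared once the stream has been consumed to the end. Code that stops after one successful read() never gets that far.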

Answered by bestsss on Sep 23 '22 at 15:09