
How do I write a gzipped byte array via SFTP (Jsch) without decompressing it?

Tags: java, gzip, sftp, jsch

(This was cross-posted to the JSch mailing list, BTW.) I'm reading data from a database and carrying it as a byte[] (for transport across middleware components).

From that byte[] I know how to create a gzipped file on the local file system by using the GZIPOutputStream class. What I want to do is create a gzipped file on a remote file system by using the JSch SFTP methods.

I've gzipped the byte[] of data and am passing it as an InputStream to the JSch library for SFTPing to a remote directory (as a .gz file). However, the file that is delivered has an unexpected EOF and cannot be 'gunzipped':

gunzip: GlobalIssuer.xml.gz: unexpected end of file

Reminder: I'm not transferring a byte[] that is the contents of a .gz file; it's the contents of a database record.

The (relatively) SSCCE is as follows:

byte[] content = "Content".getBytes();
// It does work (I promise!) returns a 'gzipped' byte[]
byte[] gzippedContent = gzipContent(content);
ByteArrayInputStream bais = new ByteArrayInputStream(gzippedContent);
channelSftp.put(bais, "Content.txt.gz");

The gzipContent method:

private byte[] gzipContent(byte[] content)
{
    ByteArrayInputStream in = new ByteArrayInputStream(content);

    // Create a stream to compress the data and write it to the byte array output.
    GZIPOutputStream gzipOutputStream = null;
    ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
    try
    {
        gzipOutputStream = new GZIPOutputStream(byteArrayOutputStream);
        byte[] buffer = new byte[4096];
        int bytes_read;
        while ((bytes_read = in.read(buffer)) != END_OF_FILE)
        {
            gzipOutputStream.write(buffer, 0, bytes_read);
        }

        // Return the gzipped content
        return byteArrayOutputStream.toByteArray();
    }
    catch (IOException e)
    {   
        // Altered from original to make this a SSCCE
        // Don't write exception handling like this at home!
        System.err.println("Unable to gzip content" + e.getMessage());
        return null;
    }
    /* 
     * Lots of closing streams with exception handling below.
     * I *think* I'm closing off streams in the right order
     * It's not triggering any of the System.err.println calls in any case
     * Of course System.err.println is bad, but this is a SSCCE
     */
    finally
    {
        try 
        {
            if (in != null)
            {
                in.close();
            }
        }
        catch (IOException e)
        {
            System.err.println("Was unable to close the Inputstream for gzipping, be aware of mem leak.");
        }
        try 
        {
            if (byteArrayOutputStream != null)
            {
                byteArrayOutputStream.close();
                if (gzipOutputStream != null)
                {
                    gzipOutputStream.close();
                }
            }
        }
        catch (IOException e)
        {
            System.err.println("Was unable to close the OutputStream(s) for gzipping, be aware of mem leak.");
        }
    }
}

The raw content ("Content") in bytes:

0x75 0x6E 0x63 0x6F 0x6D 0x70 0x72 0x65 0x73 0x73 0x65 0x64 0x43 0x6F 0x6E 0x74 0x65 0x6E 0x74

The gzipped content ("Content") in bytes:

0x1F 0x8B 0x08 0x00 0x00 0x00 0x00 0x00 0x00 0x00

Or alternatively:

1f8b 0800 0000 0000 0000 

The equivalent gzipped content, written out to the local file system using GZIPOutputStream and FileOutputStream:

1f8b 0800 0000 0000 0000 2bcd 4bce cf2d  ..........+ÍKÎÏ-
284a 2d2e 4e4d 71ce cf2b 49cd 2b01 00f8  (J-.NMqÎÏ+IÍ+..ø
3987 5f13 0000 00                        9._....

I think I see the problem. Although the content is gzipped properly, I haven't created the checksum suffix that gzipped files require (which the GZIPOutputStream does do in conjunction with the FileOutputStream if you're doing this on a local file system). So basically I'm missing this:

2bcd 4bce cf2d  +ÍKÎÏ-
284a 2d2e 4e4d 71ce cf2b 49cd 2b01 00f8  (J-.NMqÎÏ+IÍ+..ø
3987 5f13 0000 00                        9._....

I can't see a method in the JSch library that would do the trick, which makes me think I'm missing some fundamental point.

Martijn Verburg asked Feb 23 '23 23:02



1 Answer

It looks like your problem lies in how GZIPOutputStream is used together with ByteArrayOutputStream, and is entirely unrelated to JSch.

GZIPOutputStream uses a Deflater (through its superclass, DeflaterOutputStream) to do the actual work. This deflater is allowed to buffer as much data as it deems appropriate until you call finish() (either the stream's finish() directly, or indirectly through close()) to say that the compressed file is complete. Only then are the remaining compressed data and the gzip footer, including the checksum, written to the destination output.
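The buffering described above can be observed directly. The following is only a small sketch, not code from the question: the class name GzipFinishDemo and the method sizesAroundFinish are made up for illustration. It records how many bytes have reached the ByteArrayOutputStream before and after finish() is called.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; the names are illustrative only.
class GzipFinishDemo {

    // Returns { bytes in the sink before finish(), bytes in the sink after finish() }.
    static int[] sizesAroundFinish(byte[] content) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        GZIPOutputStream gzip = new GZIPOutputStream(sink); // the gzip header is written here
        gzip.write(content);        // small inputs may stay buffered inside the Deflater
        int before = sink.size();
        gzip.finish();              // flushes the remaining compressed data and the footer
        int after = sink.size();
        gzip.close();
        return new int[] { before, after };
    }

    public static void main(String[] args) throws IOException {
        byte[] content = "uncompressedContent".getBytes(StandardCharsets.UTF_8);
        int[] sizes = sizesAroundFinish(content);
        System.out.println("bytes before finish(): " + sizes[0]);
        System.out.println("bytes after finish():  " + sizes[1]);
    }
}
```

For a short input like the one in the question, the first number is likely to cover little more than the gzip header, which would match the truncated dump shown there.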

I think your problem could be solved by either moving the toByteArray() call after your close() cascade, or adding a finish() call before it.
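As a rough sketch of the first option (not the asker's actual code), the method could be restructured so that the GZIPOutputStream is closed before toByteArray() is called. The class name GzipBytes and the gunzip helper are assumptions added here purely to check the round trip.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class; the names are illustrative only.
class GzipBytes {

    // Compresses content and returns the complete gzip stream, footer included.
    static byte[] gzipContent(byte[] content) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // try-with-resources closes the gzip stream, which calls finish() first
        try (GZIPOutputStream gzip = new GZIPOutputStream(sink)) {
            gzip.write(content);
        }
        // Only read the bytes after the gzip stream has been closed
        return sink.toByteArray();
    }

    // Decompresses a gzip byte[]; used here only to verify the result.
    static byte[] gunzip(byte[] gzipped) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gzipped))) {
            byte[] buffer = new byte[4096];
            int bytesRead;
            while ((bytesRead = in.read(buffer)) != -1) {
                out.write(buffer, 0, bytesRead);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] content = "uncompressedContent".getBytes(StandardCharsets.UTF_8);
        byte[] gzipped = gzipContent(content);
        System.out.println("round trip ok: " + Arrays.equals(content, gunzip(gzipped)));
    }
}
```

The resulting byte[] can then be wrapped in a ByteArrayInputStream and handed to channelSftp.put(...) exactly as in the question; no JSch-specific change should be needed.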

Paŭlo Ebermann answered Feb 26 '23 23:02
