
Compression on Java NIO direct buffers

The gzip input/output streams don't operate on Java NIO direct buffers.

Is there any compression algorithm implementation out there that operates directly on direct buffers?

This way there would be no overhead of copying a direct buffer to a Java byte array for compression.

asked Jan 07 '12 by pdeva


2 Answers

I don't mean to detract from your question, but is this really a good optimization point in your program? Have you verified with a profiler that you indeed have a problem? Your question as stated implies you have not done any research, but are merely guessing that you will have a performance or memory problem by allocating a byte[]. Since all the answers in this thread are likely to be hacks of some sort, you should really verify that you actually have a problem before fixing it.

Back to the question: if you want to compress the data "in place" in a ByteBuffer, the answer is no, there is no capability to do that built into Java.

If you allocated your buffer like the following:

byte[] bytes = getMyData();
ByteBuffer buf = ByteBuffer.wrap(bytes);

then you can filter your byte[] through a ByteBufferInputStream like the one shown in the other answer.
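
For completeness, a minimal sketch of that copy-based approach using only the standard stream classes (gzipHeapBuffer is a hypothetical helper name, not part of any API):

ByteBuffer gzipHeapBuffer(ByteBuffer buf) throws IOException {
    // Compress the heap buffer's backing array through the standard streams
    // (java.io.ByteArrayOutputStream, java.util.zip.GZIPOutputStream).
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    GZIPOutputStream gzip = new GZIPOutputStream(bos);
    gzip.write(buf.array(), buf.arrayOffset() + buf.position(), buf.remaining());
    gzip.close(); // flushes and writes the gzip trailer
    return ByteBuffer.wrap(bos.toByteArray());
}

Note this copies the data through byte[] arrays, which is exactly the overhead the question hopes to avoid.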

answered Oct 06 '22 by Jonathan S. Fisher

Wow, old question, but I stumbled upon this today.

Some libraries like zip4j can probably handle this, but since Java 11 you can get the job done with no external dependencies: Deflater and Inflater now accept ByteBuffer arguments directly.

If you are interested only in compressing data, you can just do:

void compress(ByteBuffer src, ByteBuffer dst) {
    // Raw deflate (nowrap = true): no zlib/gzip header or trailer is written.
    var def = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
    try {
        def.setInput(src);
        def.finish();
        def.deflate(dst, Deflater.SYNC_FLUSH);

        // If the deflater could not consume all of src, dst ran out of space.
        if (src.hasRemaining()) {
            throw new RuntimeException("dst too small");
        }
    } finally {
        def.end();
    }
}

Both src and dst will change positions, so you might have to flip them after compress returns.
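
For example, a minimal sketch (the buffer sizes are assumptions; they just need to be large enough for the payload):

var src = ByteBuffer.allocateDirect(1024);
// ... put data into src, then make it readable:
src.flip();

var dst = ByteBuffer.allocateDirect(1024);
compress(src, dst);
dst.flip(); // dst now spans exactly the compressed bytes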

In order to recover compressed data:

void decompress(ByteBuffer src, ByteBuffer dst) throws DataFormatException {
    // Raw inflate (nowrap = true), matching the raw deflate above.
    var inf = new Inflater(true);
    try {
        inf.setInput(src);
        inf.inflate(dst);

        // If the inflater could not consume all of src, dst ran out of space.
        if (src.hasRemaining()) {
            throw new RuntimeException("dst too small");
        }

    } finally {
        inf.end();
    }
}

Note that both methods expect (de)compression to happen in a single pass. However, we can use slightly modified versions in order to stream the data:

void compress(ByteBuffer src, ByteBuffer dst, Consumer<ByteBuffer> sink) {
    var def = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
    try {
        def.setInput(src);
        def.finish();
        int cmp;
        do {
            cmp = def.deflate(dst, Deflater.SYNC_FLUSH);
            if (cmp > 0) {
                // Hand the filled buffer to the sink (flipped for reading), then reuse it.
                sink.accept(dst.flip());
                dst.clear();
            }
        } while (cmp > 0);
    } finally {
        def.end();
    }
}

void decompress(ByteBuffer src, ByteBuffer dst, Consumer<ByteBuffer> sink) throws DataFormatException {
    var inf = new Inflater(true);
    try {
        inf.setInput(src);
        int dec;
        do {
            dec = inf.inflate(dst);

            if (dec > 0) {
                // Hand the filled buffer to the sink (flipped for reading), then reuse it.
                sink.accept(dst.flip());
                dst.clear();
            }

        } while (dec > 0);
    } finally {
        inf.end();
    }
}

Example:

void compressLargeFile() throws IOException {
    var in = FileChannel.open(Paths.get("large"));
    var temp = ByteBuffer.allocateDirect(1024 * 1024);
    var out = FileChannel.open(Paths.get("large.zip"),
            StandardOpenOption.CREATE, StandardOpenOption.WRITE);

    var start = 0L;
    var rem = in.size();
    while (rem > 0) {
        // Map at most 16 MiB of the input at a time.
        var mapped = Math.min(16 * 1024 * 1024, rem);
        var src = in.map(MapMode.READ_ONLY, start, mapped);

        compress(src, temp, (bb) -> {
            try {
                out.write(bb);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });

        start += mapped;
        rem -= mapped;
    }
}
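
A matching sketch for the read side could look like the following. Note that each mapped chunk above is deflated as an independent stream, so this simple reader only handles output produced from a single chunk; larger inputs would need one Inflater per chunk (the file names are assumptions):

void decompressLargeFile() throws IOException, DataFormatException {
    var in = FileChannel.open(Paths.get("large.zip"));
    var temp = ByteBuffer.allocateDirect(1024 * 1024);
    var out = FileChannel.open(Paths.get("large.out"),
            StandardOpenOption.CREATE, StandardOpenOption.WRITE);

    // Map the whole compressed file and inflate it through the sink.
    var src = in.map(MapMode.READ_ONLY, 0, in.size());

    decompress(src, temp, (bb) -> {
        try {
            out.write(bb);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    });
}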

If you want fully gzip-compliant data:

void zip(ByteBuffer src, ByteBuffer dst) {
    // Gzip framing: 10-byte header, raw deflate data, 8-byte trailer.
    var u = src.remaining();
    var crc = new CRC32();
    crc.update(src.duplicate()); // duplicate() so src's position is left untouched

    writeHeader(dst);
    compress(src, dst);
    writeTrailer(crc, u, dst);
}

Where:

void writeHeader(ByteBuffer dst) {
    // Gzip header: magic bytes 0x1f 0x8b, compression method 8 (deflate),
    // then FLG, MTIME (4 bytes), XFL and OS, all zeroed.
    var header = new byte[] { (byte) 0x8b1f, (byte) (0x8b1f >> 8), Deflater.DEFLATED, 0, 0, 0, 0, 0, 0, 0 };
    dst.put(header);
}

And:

void writeTrailer(CRC32 crc, int uncompressed, ByteBuffer dst) {
    // The gzip trailer is the CRC32 of the uncompressed data followed by its
    // length, both as little-endian 32-bit integers.
    if (dst.order() == ByteOrder.LITTLE_ENDIAN) {
        dst.putInt((int) crc.getValue());
        dst.putInt(uncompressed);
    } else {
        dst.putInt(Integer.reverseBytes((int) crc.getValue()));
        dst.putInt(Integer.reverseBytes(uncompressed));
    }
}

So gzip imposes 18 bytes of overhead: a 10-byte header plus an 8-byte trailer.

In order to unzip a direct buffer into another, you can wrap the src buffer in an InputStream:

class ByteBufferInputStream extends InputStream {

    final ByteBuffer bb;

    public ByteBufferInputStream(ByteBuffer bb) {
        this.bb = bb;
    }

    @Override
    public int available() throws IOException {
        return bb.remaining();
    }

    @Override
    public int read() throws IOException {
        return bb.hasRemaining() ? bb.get() & 0xFF : -1;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        var rem = bb.remaining();

        if (rem == 0) {
            return -1;
        }

        len = Math.min(rem, len);

        bb.get(b, off, len);

        return len;
    }

    @Override
    public long skip(long n) throws IOException {
        var rem = bb.remaining();

        if (n > rem) {
            bb.position(bb.limit());
            n = rem;
        } else {
            bb.position((int) (bb.position() + n));
        }

        return n;
    }
}

and use:

void unzip(ByteBuffer src, ByteBuffer dst) throws IOException {
    try (var is = new ByteBufferInputStream(src); var gis = new GZIPInputStream(is)) {
        var tmp = new byte[1024];

        var r = gis.read(tmp);

        if (r > 0) {
            do {
                dst.put(tmp, 0, r);
                r = gis.read(tmp);
            } while (r > 0);
        }

    }
}

Of course, this is not ideal since we are copying data through a temporary array, but nevertheless it is a sort of roundtrip check, proving that the NIO-based gzip encoding writes valid data that standard IO-based consumers can read.
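
A minimal sketch of such a roundtrip (the buffer sizes are assumptions; the payload must fit):

var src = ByteBuffer.wrap("hello hello hello".getBytes(StandardCharsets.UTF_8));
var zipped = ByteBuffer.allocate(128);
zip(src, zipped);
zipped.flip();

var restored = ByteBuffer.allocate(128);
unzip(zipped, restored);
restored.flip(); // restored holds the original bytes; GZIPInputStream also verified the CRC32 trailer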

So, if we are willing to skip the CRC consistency check, we can simply drop the header and trailer and inflate the raw deflate payload directly:

void unzipNoCheck(ByteBuffer src, ByteBuffer dst) throws DataFormatException {
    // Skip the 10-byte gzip header and cut off the 8-byte trailer,
    // leaving only the raw deflate payload.
    src.position(src.position() + 10).limit(src.limit() - 8);

    decompress(src, dst);
}
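
Usage mirrors unzip, except that nothing verifies the CRC32 afterwards (a sketch, reusing zipped from the roundtrip above after rewinding it):

zipped.rewind(); // reread the same gzip data from the start
var restored2 = ByteBuffer.allocate(128);
unzipNoCheck(zipped, restored2);
restored2.flip(); // same payload as before, but the CRC32 was never checked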
answered Oct 06 '22 by Cleber Muramoto