The GZIP input/output streams don't operate on Java direct buffers.
Is there any compression algorithm implementation out there that operates directly on direct buffers?
That way there would be no overhead from copying a direct buffer into a Java byte array for compression.
A direct buffer is a chunk of native memory shared with Java from which you can perform a direct read. An instance of DirectByteBuffer can be created using the ByteBuffer.allocateDirect() factory method.
Buffers in Java NIO can be treated as simple objects that act as fixed-size containers of data. They are used to write data to a channel or read data from a channel, so buffers act as the endpoints of channels.
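As a small illustration of the factory method mentioned above, here is a minimal sketch. The class name DirectBufferDemo and the helper newDirect are made up for this example. It also prints hasArray(), because the lack of an accessible backing byte[] is what makes direct buffers awkward to feed into array-based stream APIs.

```java
import java.nio.ByteBuffer;

public class DirectBufferDemo {
    // Hypothetical helper: allocates native (off-heap) memory of the given size.
    static ByteBuffer newDirect(int capacity) {
        return ByteBuffer.allocateDirect(capacity);
    }

    public static void main(String[] args) {
        ByteBuffer direct = newDirect(64);
        ByteBuffer heap = ByteBuffer.allocate(64);

        System.out.println(direct.isDirect()); // true
        System.out.println(heap.isDirect());   // false

        // A heap buffer exposes its byte[]; a direct buffer usually does not,
        // so check hasArray() before calling array().
        System.out.println(heap.hasArray());
        System.out.println(direct.hasArray());
    }
}
```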
A buffer is essentially a block of memory into which you can write data, which you can then later read again. This memory block is wrapped in a NIO Buffer object, which provides a set of methods that makes it easier to work with the memory block.
ByteBuffer holds a sequence of byte values to be used in an I/O operation. The ByteBuffer class provides several categories of operations upon byte buffers, including absolute and relative get methods that read single bytes, and absolute and relative put methods that write single bytes.
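The difference between relative and absolute access is easiest to see in a few lines. This is only a sketch; the class name GetPutDemo and the method demo are invented for the illustration.

```java
import java.nio.ByteBuffer;

public class GetPutDemo {
    // Returns {position after the puts, first relative get, absolute get, final position}.
    static int[] demo() {
        ByteBuffer buf = ByteBuffer.allocateDirect(4);

        buf.put((byte) 10);      // relative put: writes index 0, position becomes 1
        buf.put((byte) 20);      // relative put: writes index 1, position becomes 2
        buf.put(3, (byte) 40);   // absolute put: writes index 3, position stays 2
        int posAfterPuts = buf.position();

        buf.flip();              // limit = 2, position = 0: ready for reading

        int first = buf.get();   // relative get: reads index 0, position becomes 1
        int second = buf.get(1); // absolute get: reads index 1, position stays 1

        return new int[] { posAfterPuts, first, second, buf.position() };
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println(r[0] + " " + r[1] + " " + r[2] + " " + r[3]); // 2 10 20 1
    }
}
```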
I don't mean to detract from your question, but is this really a good optimization point in your program? Have you verified with a profiler that you indeed have a problem? Your question as stated implies you have not done any research, but are merely guessing that you will have a performance or memory problem by allocating a byte[]. Since all the answers in this thread are likely to be hacks of some sort, you should really verify that you actually have a problem before fixing it.
Back to the question: if you want to compress the data "in place" in a ByteBuffer, the answer is no, there is no capability to do that built into Java.
If you allocated your buffer like the following:
byte[] bytes = getMyData();
ByteBuffer buf = ByteBuffer.wrap(bytes);
You can filter your byte[] through a ByteBufferInputStream as the previous answer suggested.
Wow old question, but stumbled upon this today.
Probably some libs like zip4j can handle this, but you can get the job done with no external dependencies since Java 11:
If you are interested only in compressing data, you can just do:
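void compress(ByteBuffer src, ByteBuffer dst) {
    // nowrap = true: raw deflate data, no zlib header or checksum
    var def = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
    try {
        def.setInput(src);
        def.finish();
        def.deflate(dst, Deflater.SYNC_FLUSH);
        // src may be fully consumed while output is still pending,
        // so check finished() rather than src.hasRemaining()
        if (!def.finished()) {
            throw new RuntimeException("dst too small");
        }
    } finally {
        def.end();
    }
}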
Both src and dst will change positions, so you might have to flip them after compress returns.
In order to recover compressed data:
void decompress(ByteBuffer src, ByteBuffer dst) throws DataFormatException {
    // nowrap = true: expects raw deflate data, matching compress above
    var inf = new Inflater(true);
    try {
        inf.setInput(src);
        inf.inflate(dst);
        // not finished means dst filled up (or src was truncated)
        if (!inf.finished()) {
            throw new RuntimeException("dst too small");
        }
    } finally {
        inf.end();
    }
}
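To show the flipping in practice, here is a self-contained round trip between direct buffers. It is a sketch under a few assumptions: the class name DirectRoundTrip and the method roundTrip are invented, the compress and decompress bodies are restated variants of the single-pass methods that check finished(), the 64 bytes of slack in the compressed buffer is only a guess that suits small inputs, and the caller is assumed to know the uncompressed size.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class DirectRoundTrip {
    static void compress(ByteBuffer src, ByteBuffer dst) {
        Deflater def = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
        try {
            def.setInput(src);
            def.finish();
            def.deflate(dst, Deflater.SYNC_FLUSH);
            if (!def.finished()) {
                throw new IllegalStateException("dst too small");
            }
        } finally {
            def.end();
        }
    }

    static void decompress(ByteBuffer src, ByteBuffer dst) throws DataFormatException {
        Inflater inf = new Inflater(true);
        try {
            inf.setInput(src);
            inf.inflate(dst);
            if (!inf.finished()) {
                throw new IllegalStateException("dst too small or src truncated");
            }
        } finally {
            inf.end();
        }
    }

    // Compresses and restores a string using direct buffers only for the codec calls.
    static String roundTrip(String text) {
        byte[] raw = text.getBytes(StandardCharsets.UTF_8);

        ByteBuffer src = ByteBuffer.allocateDirect(raw.length);
        src.put(raw).flip();                  // flip: make the written bytes readable

        // Assumption: 64 bytes of slack is enough for small inputs.
        ByteBuffer packed = ByteBuffer.allocateDirect(raw.length + 64);
        compress(src, packed);
        packed.flip();                        // position moved during compress, so flip

        ByteBuffer restored = ByteBuffer.allocateDirect(raw.length);
        try {
            decompress(packed, restored);
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        }
        restored.flip();

        byte[] out = new byte[restored.remaining()];
        restored.get(out);                    // copy out only to build the String
        return new String(out, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("direct buffers, direct buffers, direct buffers"));
    }
}
```

The byte[] copies at the edges exist only to get text in and out of the example; the deflate and inflate calls themselves read and write the direct buffers.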
Note that both methods expect (de-)compression to happen in a single pass. However, we can use slightly modified versions in order to stream it:
void compress(ByteBuffer src, ByteBuffer dst, Consumer<ByteBuffer> sink) {
    var def = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
    try {
        def.setInput(src);
        def.finish();
        int cmp;
        do {
            // dst is a reusable scratch buffer; hand each filled chunk to sink
            cmp = def.deflate(dst, Deflater.SYNC_FLUSH);
            if (cmp > 0) {
                sink.accept(dst.flip());
                dst.clear();
            }
        } while (cmp > 0);
    } finally {
        def.end();
    }
}
void decompress(ByteBuffer src, ByteBuffer dst, Consumer<ByteBuffer> sink) throws DataFormatException {
    var inf = new Inflater(true);
    try {
        inf.setInput(src);
        int dec;
        do {
            dec = inf.inflate(dst);
            if (dec > 0) {
                sink.accept(dst.flip());
                dst.clear();
            }
        } while (dec > 0);
    } finally {
        inf.end();
    }
}
Example:
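void compressLargeFile() throws IOException {
    var temp = ByteBuffer.allocateDirect(1024 * 1024);
    // needs java.nio.file.StandardOpenOption; channels are closed by try-with-resources
    try (var in = FileChannel.open(Paths.get("large"));
         var out = FileChannel.open(Paths.get("large.zip"),
                 StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
        long start = 0;
        long rem = in.size();
        while (rem > 0) {
            long mapped = Math.min(16L * 1024 * 1024, rem);
            var src = in.map(FileChannel.MapMode.READ_ONLY, start, mapped);
            // each mapped chunk is written as its own raw deflate stream
            compress(src, temp, (bb) -> {
                try {
                    out.write(bb);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            start += mapped; // advance the mapping window
            rem -= mapped;
        }
    }
}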
If you want fully gzip-compliant data (the header and trailer used here are the gzip framing, despite the zip naming):
void zip(ByteBuffer src, ByteBuffer dst) {
    var u = src.remaining();
    var crc = new CRC32();
    crc.update(src.duplicate()); // duplicate so src's position is left untouched
    writeHeader(dst);
    compress(src, dst);
    writeTrailer(crc, u, dst);
}
Where:
void writeHeader(ByteBuffer dst) {
    // magic bytes 0x1f 0x8b, method = deflate, then flags, mtime, xfl and os all zero
    var header = new byte[] { (byte) 0x8b1f, (byte) (0x8b1f >> 8), Deflater.DEFLATED, 0, 0, 0, 0, 0, 0, 0 };
    dst.put(header);
}
And:
void writeTrailer(CRC32 crc, int uncompressed, ByteBuffer dst) {
    // the trailer is little-endian: CRC-32 first, then the uncompressed size
    if (dst.order() == ByteOrder.LITTLE_ENDIAN) {
        dst.putInt((int) crc.getValue());
        dst.putInt(uncompressed);
    } else {
        dst.putInt(Integer.reverseBytes((int) crc.getValue()));
        dst.putInt(Integer.reverseBytes(uncompressed));
    }
}
So, the gzip framing imposes 10 + 8 = 18 bytes of overhead.
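The framing can be checked end to end with a short self-contained sketch. Everything named here is made up for the illustration (the class GzipFrameDemo and the methods gzip and gunzip), the 128 bytes of slack in the output buffer is an assumption that only suits small inputs, and the final copy into a byte[] exists purely so the standard GZIPInputStream can verify the result.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.zip.CRC32;
import java.util.zip.Deflater;
import java.util.zip.GZIPInputStream;

public class GzipFrameDemo {
    // Builds header + raw deflate data + trailer inside a direct buffer.
    static byte[] gzip(byte[] raw) {
        ByteBuffer src = ByteBuffer.allocateDirect(raw.length);
        src.put(raw).flip();

        // Little-endian order lets putInt write the trailer fields directly.
        ByteBuffer dst = ByteBuffer.allocateDirect(raw.length + 128)
                .order(ByteOrder.LITTLE_ENDIAN);

        CRC32 crc = new CRC32();
        crc.update(src.duplicate()); // duplicate keeps src's position unchanged

        // 10-byte header: magic, method, then flags/mtime/xfl/os left at zero
        dst.put(new byte[] { 0x1f, (byte) 0x8b, Deflater.DEFLATED, 0, 0, 0, 0, 0, 0, 0 });

        Deflater def = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
        try {
            def.setInput(src);
            def.finish();
            def.deflate(dst, Deflater.SYNC_FLUSH);
            if (!def.finished()) {
                throw new IllegalStateException("dst too small");
            }
        } finally {
            def.end();
        }

        // 8-byte trailer: CRC-32, then uncompressed length
        dst.putInt((int) crc.getValue());
        dst.putInt(raw.length);

        dst.flip();
        byte[] out = new byte[dst.remaining()];
        dst.get(out); // copied out only for verification
        return out;
    }

    // Reads the bytes back with the standard stream-based decoder.
    static byte[] gunzip(byte[] gz) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gz))) {
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] raw = "hello hello hello hello".getBytes();
        byte[] gz = gzip(raw);
        System.out.println(java.util.Arrays.equals(raw, gunzip(gz))); // true
    }
}
```

GZIPInputStream validates the magic bytes, the CRC-32 and the length field, so a successful read suggests the 18 bytes of framing were written correctly.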
In order to unzip a direct buffer into another, you can wrap the src buffer into an InputStream:
class ByteBufferInputStream extends InputStream {
    final ByteBuffer bb;
    public ByteBufferInputStream(ByteBuffer bb) {
        this.bb = bb;
    }
    @Override
    public int available() throws IOException {
        return bb.remaining();
    }
    @Override
    public int read() throws IOException {
        // mask to return an unsigned value in 0..255, as InputStream requires
        return bb.hasRemaining() ? bb.get() & 0xFF : -1;
    }
    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        var rem = bb.remaining();
        if (rem == 0) {
            return -1;
        }
        len = Math.min(rem, len);
        bb.get(b, off, len);
        return len;
    }
    @Override
    public long skip(long n) throws IOException {
        if (n <= 0) {
            return 0; // never move the position backwards
        }
        var rem = bb.remaining();
        if (n > rem) {
            bb.position(bb.limit());
            n = rem;
        } else {
            bb.position((int) (bb.position() + n));
        }
        return n;
    }
}
and use:
void unzip(ByteBuffer src, ByteBuffer dst) throws IOException {
    try (var is = new ByteBufferInputStream(src); var gis = new GZIPInputStream(is)) {
        var tmp = new byte[1024];
        var r = gis.read(tmp);
        if (r > 0) {
            do {
                dst.put(tmp, 0, r);
                r = gis.read(tmp);
            } while (r > 0);
        }
    }
}
Of course, this is not cool since we are copying data to a temporary array. Nevertheless, it is a sort of roundtrip check showing that the nio-based encoding writes valid gzip data that standard io-based consumers can read.
So, if we just ignore the crc consistency checks, we can simply drop the header and trailer:
void unzipNoCheck(ByteBuffer src, ByteBuffer dst) throws DataFormatException {
    // skip the 10-byte header and cut off the 8-byte trailer
    src.position(src.position() + 10).limit(src.limit() - 8);
    decompress(src, dst);
}