I'm required to work on a serialization library in Java which must be as fast as possible. The idea is to create various methods which will serialize the specified value and its associated key and puts them in a byte buffer. Several objects that wrap this buffer must be created since the objects that need to be serialized are potentially alot.
Considerations: I know the Unsafe class may not be implemented in every JVM, but it's not a problem. Premature optimization: this library has to be fast and this serialization is the only thing it has to do. The objects once serialized are tipically small (less than 10k) but they are alot and they can be up to 2Gb big. The underlying buffer can be expanded / reduced but I'll skip implementation details, the method is similar to the one used in the ArrayList implementation.
To clarify my situation: I have various methods like
public void putByte(short key, byte value);
public void putInt(short key, int value);
public void putFloat(short key, float value);
... and so on...
these methods append the key and the value in a byte stream, so if i call putInt(-1, 1234567890) my buffer would look like: (the stream is big endian)
key the integer value
[0xFF, 0xFF, 0x49, 0x96, 0x02, 0xD2]
In the end a method like toBytes() must be called to return a byte array which is a trimmed (if needed) version of the underlying buffer.
Now, my question is: what is the fastest way to do this in java?
I googled and stumbled upon various pages (some of these were on SO) and I also did some benchmarks (but i'm not really experienced in benchmarks and that's one of the reasons I'm asking for the help of more experienced programmers about this topic).
I came up with the following solutions:
1- The most immediate: a byte array
If I have to serialize an int it would look like this:
public void putInt(short key, int value)
{
array[index] = (byte)(key >> 8);
array[index+1] = (byte) key;
array[index+2] = (byte)(value >> 24);
array[index+3] = (byte)(value >> 16);
array[index+4] = (byte)(value >> 8);
array[index+5] = (byte) value;
}
2- A ByteBuffer (be it direct or a byte array wrapper)
The putInt method would look like the following
public void putInt(short key, int value)
{
byteBuff.put(key).put(value);
}
3- Allocation on native memory through Unsafe
Using the Unsafe class I would allocate the buffer on native memory and so the putInt would look like:
public void putInt(short key, int value)
{
Unsafe.putShort(address, key);
Unsafe.putInt(address+2, value);
}
4- allocation through new byte[], access through Unsafe
I saw this method in the lz4 compression library written in java. Basically once a byte array is instantiated i write bytes the following way:
public void putInt(short key, int value)
{
Unsafe.putShort(byteArray, BYTE_ARRAY_OFFSET + 0, key);
Unsafe.putInt(byteArray, BYTE_ARRAY_OFFSET + 2, value);
}
The methods here are simplified, but the basic idea is the one shown, I also have to implement the getter methods . Now, since i started to work in this i learnt the following things:
1- The JVM can remove array boundary checks if it's safe (in a for loop for example where the counter has to be less to the length of the array) 2- Crossing the JVM memory boundaries (reading/writing from/to native memory) has a cost. 3- Calling a native method may have a cost. 4- Unsafe putters and getters don't make boundary checks in native memory, nor on a regular array. 5- ByteBuffers wrap a byte array (non direct) or a plain native memory area (direct) so case 2 internally would look like case 1 or 3.
I run some benchmarks (but as I said I would like the opinion / experience of other developers) and it seems that case 4 is slightly (almost equals) to case 1 in reading and about 3 times faster in writing. It also seems that a for loop with Unsafe read and write (case 4) to copy an array to another (copying 8 bytes at time) is faster than System.arraycopy.
Long story made short (sorry for the long post):
case 1 seems to be fast, but that way I have to write a single byte each time + masking operations, which makes me think that maybe Unsafe, even if it's a call to native code may be faster.
case 2 is similar to case 1 and 3, so I could skip it (correct me if I'm missing something)
case 3 seems to be the slowest (at least from my benchmarks), also, I would need to copy from a native memory to a byte array because that's must be the output. But here this programmer claims it's the fastest way by far. If I understood correctly, what am I missing?
case 4 (as supported here) seems to be the fastest.
The number of choices and some contradictory information confuse me a bit, so can anyone clarify me these doubts?
I hope I wrote every needed information, otherwise just ask for clarifications.
Thanks in advance.
Case 5: DataOutputStream
writing to a ByteArrayOutputStream.
Pro: it's already done; it's as fast as anything else you've mentioned here; all primitives are already implemented. The converse is DataInputStream reading from a ByteArrayInputStream.
Con: nothing I can think of.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With