I have to apply bit operations and arithmetic operations to each byte of my stream.
I identified the for loop in the code example below as the bottleneck of my output stream and would like to optimize it. I'm just out of ideas ;)
    private static final long A = 0x1ABCDE361L;
    private static final long C = 0x87;
    private long x;

    // This method belongs to a class that extends java.io.FilterOutputStream
    @Override
    public void write(byte[] buffer, int offset, int length) throws IOException {
        for (int i = 0; i < length; i++) {
            // Advance the 48-bit linear congruential generator
            x = (A * x + C) & 0xffffffffffffL;
            // XOR the byte with bits 16..23 of the state
            buffer[offset + i] = (byte) (buffer[offset + i] ^ (x >>> 16));
        }
        out.write(buffer, offset, length);
    }
The code is primarily used on Android devices.
I am looking for at least a 50% reduction in execution time. I learnt from my benchmarks with CRC32 that CRC32#update(byte[] b, int off, int len) is ten times faster than CRC32#update(int b) on chunks larger than 30 bytes. (My chunks are > 4096 bytes.) So I guess I need an implementation that processes a whole array at once.
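For reference, the two CRC32 call styles compared above look roughly like this. It is only a sketch: the class name Crc32CallStyles and its helper methods are made up for illustration, and it only shows that both styles produce the same checksum, not how fast either one is.

```java
import java.util.zip.CRC32;

// Hypothetical helper class, not part of the original code.
class Crc32CallStyles {

    // One CRC32#update(int) call per byte of the array.
    static long perByte(byte[] data) {
        CRC32 crc = new CRC32();
        for (byte b : data) {
            crc.update(b); // the byte is widened to int; only the low 8 bits are used
        }
        return crc.getValue();
    }

    // A single CRC32#update(byte[], int, int) call for the whole array.
    static long bulk(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }
}
```

Both methods return the same value for the same input, so the bulk call can replace the per-byte loop without changing the result.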
The following is a little bit faster on 32-bit CPUs:
    private static final long A = 0x1ABCDE361L;
    private static final long C = 0x87;
    private long x;

    // This method belongs to a class that extends java.io.FilterOutputStream
    @Override
    public void write(byte[] buffer, int offset, int length) throws IOException {
        for (int i = 0; i < length; i++) {
            // No mask: the upper bits of x never reach the output
            x = A * x + C;
            // Narrow x to int first, so the shift and the XOR are 32-bit operations
            buffer[offset + i] = (byte) (buffer[offset + i] ^ ((int) x >>> 16));
        }
        out.write(buffer, offset, length);
    }
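Because x is shifted right by 16 bits and the result of the XOR is cast to byte, only bits 16 through 23 of x are actually used. x can therefore be cast to 32 bits before the right shift, so that the shift and the XOR run as int operations, which makes those two operations faster on 32-bit CPUs. Dropping the 48-bit mask does not change the output either: the low bits of A * x + C depend only on the low bits of x, so bits 16 through 23 are the same with or without the mask.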
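To check that the two loops really produce the same bytes, they can be run side by side on the same input. This is a minimal sketch: the class name KeystreamEquivalence, the method names, and the explicit seed parameter are assumptions added for the test, while the loop bodies follow the two versions of write above.

```java
import java.util.Arrays;

// Hypothetical test harness, not part of the original stream class.
class KeystreamEquivalence {
    private static final long A = 0x1ABCDE361L;
    private static final long C = 0x87;

    // First version: state masked to 48 bits, shift performed on the long.
    static byte[] maskedLong(byte[] data, long seed) {
        byte[] out = data.clone();
        long x = seed;
        for (int i = 0; i < out.length; i++) {
            x = (A * x + C) & 0xffffffffffffL;
            out[i] = (byte) (out[i] ^ (x >>> 16));
        }
        return out;
    }

    // Second version: no mask, state narrowed to int before the shift.
    static byte[] unmaskedInt(byte[] data, long seed) {
        byte[] out = data.clone();
        long x = seed;
        for (int i = 0; i < out.length; i++) {
            x = A * x + C;
            out[i] = (byte) (out[i] ^ ((int) x >>> 16));
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] data = new byte[8192];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i * 31);
        }
        long seed = 0x123456789ABCDEFL; // arbitrary starting state
        // Prints true if both versions produce identical output.
        System.out.println(Arrays.equals(maskedLong(data, seed), unmaskedInt(data, seed)));
    }
}
```

Since the operation is a plain XOR with a keystream, running either method twice with the same seed gives back the original bytes, which makes a second easy sanity check.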