While profiling an application I noticed that RandomAccessFile.writeLong was taking a lot of time.
I checked the code for this method, and it makes eight calls to the native write method, one per byte. I wrote an alternative implementation of writeLong using a byte[]. Something like this:
RandomAccessFile randomAccessFile = new RandomAccessFile("out.dat", "rwd");
...
byte[] aux = new byte[8];
aux[0] = (byte) ((l >>> 56) & 0xFF);
aux[1] = (byte) ((l >>> 48) & 0xFF);
aux[2] = (byte) ((l >>> 40) & 0xFF);
aux[3] = (byte) ((l >>> 32) & 0xFF);
aux[4] = (byte) ((l >>> 24) & 0xFF);
aux[5] = (byte) ((l >>> 16) & 0xFF);
aux[6] = (byte) ((l >>> 8) & 0xFF);
aux[7] = (byte) ((l >>> 0) & 0xFF);
randomAccessFile.write(aux);
I made a small benchmark and got these results:
Using writeLong():
Average time for invocation: 91 ms
Using write(byte[]):
Average time for invocation: 11 ms
The test was run on a Linux machine with an Intel(R) CPU T2300 @ 1.66GHz.
Since native calls carry some performance penalty, why is writeLong implemented that way? I know the question should really be put to the Sun guys, but I hope someone here has some hints.
Thank you.
It appears that RandomAccessFile.writeLong() doesn't minimise the number of calls to the OS. The cost increases dramatically when using "rwd" instead of "rw", which should be enough to indicate that it's not the calls themselves that cost the time. (It's the fact that the OS is trying to commit every write to disk, and the disk only spins so fast.)
{
    RandomAccessFile raf = new RandomAccessFile("test.dat", "rwd");
    int longCount = 10000;
    long start = System.nanoTime();
    for (long l = 0; l < longCount; l++)
        raf.writeLong(l);
    long time = System.nanoTime() - start;
    System.out.printf("writeLong() took %,d us on average%n", time / longCount / 1000);
    raf.close();
}
{
    RandomAccessFile raf = new RandomAccessFile("test2.dat", "rwd");
    int longCount = 10000;
    long start = System.nanoTime();
    byte[] aux = new byte[8];
    for (long l = 0; l < longCount; l++) {
        aux[0] = (byte) (l >>> 56);
        aux[1] = (byte) (l >>> 48);
        aux[2] = (byte) (l >>> 40);
        aux[3] = (byte) (l >>> 32);
        aux[4] = (byte) (l >>> 24);
        aux[5] = (byte) (l >>> 16);
        aux[6] = (byte) (l >>> 8);
        aux[7] = (byte) l;
        raf.write(aux);
    }
    long time = System.nanoTime() - start;
    System.out.printf("write byte[8] took %,d us on average%n", time / longCount / 1000);
    raf.close();
}
prints
writeLong() took 2,321 us on average
write byte[8] took 576 us on average
It would appear to me that you have no disk write caching on. Without disk caching, I would expect each committed write to take about 11 ms on a 5400 RPM disk, i.e. 60000 ms / 5400 ≈ 11 ms per revolution.
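The arithmetic behind that estimate can be sketched as follows. This is only a back-of-the-envelope model: it assumes, as the estimate above does, that each committed write costs roughly one full platter revolution. The class and method names are made up for illustration.

```java
// Hypothetical helper for the rotational-latency estimate; not a benchmark.
final class RotationLatency {

    // Time for one full platter revolution, in milliseconds.
    static double millisPerRevolution(int rpm) {
        return 60_000.0 / rpm;
    }

    public static void main(String[] args) {
        // A few common spindle speeds, for comparison.
        for (int rpm : new int[] {5400, 7200, 10000}) {
            System.out.printf("%d RPM: %.1f ms per revolution%n",
                    rpm, millisPerRevolution(rpm));
        }
    }
}
```

Under the same assumption, eight separately committed one-byte writes would cost about 8 × 11 ms ≈ 89 ms, which is in line with the 91 ms measured for writeLong() in the question, while a single committed write(byte[]) matches the 11 ms figure.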
I would vote for laziness, or (being more charitable) not thinking about the consequences.
A native implementation of writeLong() would potentially require versions for every architecture, to deal with byte ordering (JNI will convert to platform byte order). By keeping the translation in the "cross-platform" layer, the developers simplified the job of porting.
As to why they didn't convert to an array while on the Java side, I suspect that was due to fear of garbage collection. I would guess that RandomAccessFile has changed minimally since 1.1, and it wasn't until 1.3 that garbage collection started to make small object allocations "free".
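For what it's worth, the allocation concern can be sidestepped by reusing one scratch array per writer. The following is a minimal sketch of that idea, not how the JDK does it: SingleCallLongWriter and roundTrip are hypothetical names, and the sketch relies on ByteBuffer defaulting to big-endian order, which is the same high-byte-first layout that RandomAccessFile.readLong() expects.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

// Hypothetical helper, not part of the JDK. Not thread-safe: the scratch
// array is shared by all calls on the same instance.
final class SingleCallLongWriter {
    private final byte[] scratch = new byte[8];
    // View over the scratch array; ByteBuffer is big-endian by default.
    private final ByteBuffer view = ByteBuffer.wrap(scratch);

    // Encodes the long into the reused array and issues one write call.
    void writeLong(RandomAccessFile file, long value) throws IOException {
        view.putLong(0, value); // absolute put, position is left untouched
        file.write(scratch);
    }

    // Writes a value to a temp file and reads it back with readLong().
    static long roundTrip(long value) {
        try {
            File tmp = File.createTempFile("single-call", ".dat");
            tmp.deleteOnExit();
            try (RandomAccessFile raf = new RandomAccessFile(tmp, "rw")) {
                new SingleCallLongWriter().writeLong(raf, value);
                raf.seek(0);
                return raf.readLong();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long value = 0x0102030405060708L;
        System.out.println(roundTrip(value) == value);
    }
}
```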
But there's an alternative to RandomAccessFile: take a look at MappedByteBuffer.
Edit: I have a machine with JDK 1.2.2, and this method has not changed since then.
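The MappedByteBuffer suggestion could look roughly like the sketch below. It is only an illustration under some assumptions: MappedLongWriter, writeLongs and readLongAt are made-up names, the region size is known up front, and force() is called once at the end. That gives different durability semantics from "rwd", which commits every write, and nothing here was measured against the numbers in this thread.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical example class; names are for illustration only.
final class MappedLongWriter {

    // Maps count * 8 bytes of the file and writes the values 0..count-1.
    static void writeLongs(File file, int count) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw");
             FileChannel channel = raf.getChannel()) {
            long size = (long) count * Long.BYTES;
            // Mapping READ_WRITE beyond the current end extends the file.
            MappedByteBuffer buffer =
                    channel.map(FileChannel.MapMode.READ_WRITE, 0, size);
            for (long l = 0; l < count; l++) {
                buffer.putLong(l); // big-endian by default, like writeLong()
            }
            buffer.force(); // ask the OS to flush the mapped region once
        }
    }

    // Reads the index-th long back through the ordinary file API.
    static long readLongAt(File file, long index) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            raf.seek(index * Long.BYTES);
            return raf.readLong();
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("mapped", ".dat");
        file.deleteOnExit();
        writeLongs(file, 10_000);
        System.out.println(file.length() + " bytes, last value "
                + readLongAt(file, 9_999));
    }
}
```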