The following code
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class Main {
    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("deleteme", "dat");
        tmp.deleteOnExit();
        RandomAccessFile raf = new RandomAccessFile(tmp, "rw");
        for (int t = 0; t < 10; t++) {
            long start = System.nanoTime();
            int count = 5000;
            // Grow the file by one 4 KiB page per call.
            for (int i = 1; i < count; i++)
                raf.setLength((i + t * count) * 4096);
            long time = System.nanoTime() - start;
            System.out.println("Average call time " + time / count / 1000 + " us.");
        }
    }
}
On Java 8, this runs fine (the file is on tmpfs, so you would expect it to be trivial):
Average call time 1 us.
Average call time 0 us.
Average call time 0 us.
Average call time 0 us.
Average call time 0 us.
Average call time 0 us.
Average call time 0 us.
Average call time 0 us.
Average call time 0 us.
Average call time 0 us.
On Java 10, this gets increasingly slower as the file gets larger:
Average call time 311 us.
Average call time 856 us.
Average call time 1423 us.
Average call time 1975 us.
Average call time 2530 us.
Average call time 3045 us.
Average call time 3599 us.
Average call time 4034 us.
Average call time 4523 us.
Average call time 5129 us.
Is there a way to diagnose this kind of problem?
Is there any solution or alternative which works efficiently on Java 10?
NOTE: We could write to the end of the file, however this would require locking it which we want to avoid doing.
For comparison, on Windows 10, Java 8 (not tmpfs):
Average call time 542 us.
Average call time 487 us.
Average call time 480 us.
Average call time 490 us.
Average call time 507 us.
Average call time 559 us.
Average call time 498 us.
Average call time 526 us.
Average call time 489 us.
Average call time 504 us.
Windows 10, Java 10.0.1
Average call time 586 us.
Average call time 508 us.
Average call time 615 us.
Average call time 599 us.
Average call time 580 us.
Average call time 577 us.
Average call time 557 us.
Average call time 572 us.
Average call time 578 us.
Average call time 554 us.
UPDATE: It appears that the choice of system call has changed between Java 8 and 10. This can be seen by prepending strace -f to the start of the command line.
In Java 8, the following calls are repeated in the inner loop
[pid 49027] ftruncate(23, 53248) = 0
[pid 49027] lseek(23, 0, SEEK_SET) = 0
[pid 49027] lseek(23, 0, SEEK_CUR) = 0
In Java 10, the following calls are repeated
[pid 444] fstat(8, {st_mode=S_IFREG|0664, st_size=126976, ...}) = 0
[pid 444] fallocate(8, 0, 0, 131072) = 0
[pid 444] lseek(8, 0, SEEK_SET) = 0
[pid 444] lseek(8, 0, SEEK_CUR) = 0
In particular, fallocate does a lot more work than ftruncate, and the time taken appears to be proportional to the length of the file, not the length added to the file.
One workaround is to obtain the underlying fd file descriptor and call ftruncate on it directly.
This seems like a hacky solution. Is there a better alternative in Java 10?
Is there a way to diagnose this kind of problem?
You can use a kernel-aware Java profiler such as async-profiler.
Here is what it shows for JDK 8: [profiler output image]
and for JDK 10: [profiler output image]
The profiles confirm your conclusion that RandomAccessFile.setLength uses the ftruncate syscall on JDK 8, but a much heavier fallocate on JDK 10.
ftruncate is really fast, because it only updates file metadata, while fallocate actually allocates disk space (or physical memory in the case of tmpfs).
This change was made in an attempt to fix JDK-8168628: SIGBUS when extending file size to map it. It was later recognized as a bad idea, and the fix was reverted in JDK 11: JDK-8202261.
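Because the regression is tied to specific releases, any workaround can be gated on the running Java version. The following is a minimal sketch, assuming that only Java 10 is affected (the one release the measurements here cover; whether Java 9 shares the behavior is not established in this thread). The class and method names are hypothetical.

```java
class SetLengthVersionCheck {
    // Returns true only for the release where the fallocate-based slowdown
    // was observed. Assumption: "10" is the only affected specification version.
    static boolean needsWorkaround(String specVersion) {
        return "10".equals(specVersion);
    }

    public static void main(String[] args) {
        // "java.specification.version" is "1.8" on Java 8 and "10", "11", ... afterwards.
        String version = System.getProperty("java.specification.version");
        System.out.println("Java " + version
                + " needs setLength workaround: " + needsWorkaround(version));
    }
}
```

On Java 8 and on JDK 11 or later, the check returns false and plain setLength can be used.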
Is there any solution or alternative which works efficiently on Java 10?
There is an internal class sun.nio.ch.FileDispatcherImpl which has a static truncate0 method. It uses the ftruncate syscall under the hood. You can call it via reflection, keeping in mind that this is a private, unsupported API.
Class<?> c = Class.forName("sun.nio.ch.FileDispatcherImpl");
Method m = c.getDeclaredMethod("truncate0", FileDescriptor.class, long.class);
m.setAccessible(true);
m.invoke(null, raf.getFD(), length);
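The snippet above can be wrapped in a helper that falls back to the public API when the internal method is missing or inaccessible, which may happen on other JDK versions or under stricter module encapsulation. This is a sketch rather than a definitive implementation: the class name FastSetLength and the fallback policy are assumptions, and only the reflective call itself comes from the answer.

```java
import java.io.File;
import java.io.FileDescriptor;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

class FastSetLength {
    // Tries the internal truncate0 method (private, unsupported API) and
    // falls back to RandomAccessFile.setLength if it cannot be used.
    // Returns true if the internal method performed the resize.
    static boolean setLength(RandomAccessFile raf, long length) throws IOException {
        try {
            Class<?> c = Class.forName("sun.nio.ch.FileDispatcherImpl");
            Method m = c.getDeclaredMethod("truncate0", FileDescriptor.class, long.class);
            m.setAccessible(true);
            m.invoke(null, raf.getFD(), length);
            return true;
        } catch (InvocationTargetException e) {
            // The internal call itself failed; surface genuine I/O errors.
            if (e.getCause() instanceof IOException) {
                throw (IOException) e.getCause();
            }
        } catch (ReflectiveOperationException | RuntimeException | LinkageError e) {
            // Class or method not found, or access denied: use the public API below.
        }
        raf.setLength(length);
        return false;
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("deleteme", "dat");
        tmp.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(tmp, "rw")) {
            boolean internal = setLength(raf, 4096);
            System.out.println("length=" + raf.length() + ", used internal API: " + internal);
        }
    }
}
```

Falling back silently keeps the code working where the internal class has changed, at the cost of not signalling that the slow path is in use; the boolean return value is there so callers can log that case if they care.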