
RandomAccessFile.setLength much slower on Java 10 (Centos)

The following code

import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class Main {
    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("deleteme", "dat");
        tmp.deleteOnExit();
        RandomAccessFile raf = new RandomAccessFile(tmp, "rw");
        // 10 rounds; each round extends the file by about 5000 pages of 4 KiB
        for (int t = 0; t < 10; t++) {
            long start = System.nanoTime();
            int count = 5000;
            // grow the file one page per call
            for (int i = 1; i < count; i++)
                raf.setLength((i + t * count) * 4096);
            long time = System.nanoTime() - start;
            System.out.println("Average call time " + time / count / 1000 + " us.");
        }
    }
}

On Java 8, this runs fine (the file is on tmpfs, so you would expect it to be trivial):

Average call time 1 us.
Average call time 0 us.
Average call time 0 us.
Average call time 0 us.
Average call time 0 us.
Average call time 0 us.
Average call time 0 us.
Average call time 0 us.
Average call time 0 us.
Average call time 0 us.

On Java 10, this gets increasingly slower as the file gets larger:

Average call time 311 us.
Average call time 856 us.
Average call time 1423 us.
Average call time 1975 us.
Average call time 2530 us.
Average call time 3045 us.
Average call time 3599 us.
Average call time 4034 us.
Average call time 4523 us.
Average call time 5129 us.

Is there a way to diagnose this kind of problem?

Is there any solution or alternative which works efficiently on Java 10?

NOTE: We could write to the end of the file, however this would require locking it which we want to avoid doing.
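
For reference, a minimal sketch of what "writing to the end of the file" could look like, and why it needs a lock. The class name ExtendByWrite and the method extendTo are made up for illustration; the only library call relied on is the positional FileChannel.write(ByteBuffer, long), which grows the file when the position is past the current size.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical helper, not part of the original question.
final class ExtendByWrite {

    // Grows the file to newLength by writing one zero byte at the last offset.
    // The size check and the write are two separate steps, so another writer
    // could store real data at that offset in between and have it overwritten.
    // That race is the reason this approach would need a file lock.
    static void extendTo(FileChannel ch, long newLength) throws IOException {
        if (newLength <= ch.size()) {
            return; // already long enough, never shrink
        }
        ch.write(ByteBuffer.wrap(new byte[1]), newLength - 1);
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("deleteme", "dat");
        tmp.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(tmp, "rw")) {
            extendTo(raf.getChannel(), 4096);
            System.out.println("Length is now " + raf.length());
        }
    }
}
```

Unlike setLength, this never shrinks the file, and the check-then-write sequence is not atomic across processes.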

For comparison, on Windows 10 with Java 8 (not tmpfs):

Average call time 542 us.
Average call time 487 us.
Average call time 480 us.
Average call time 490 us.
Average call time 507 us.
Average call time 559 us.
Average call time 498 us.
Average call time 526 us.
Average call time 489 us.
Average call time 504 us.

Windows 10, Java 10.0.1

Average call time 586 us.
Average call time 508 us.
Average call time 615 us.
Average call time 599 us.
Average call time 580 us.
Average call time 577 us.
Average call time 557 us.
Average call time 572 us.
Average call time 578 us.
Average call time 554 us.

UPDATE: It appears that the system call used has changed between Java 8 and 10. This can be seen by prepending strace -f to the command line.

In Java 8, the following calls are repeated in the inner loop

[pid 49027] ftruncate(23, 53248)        = 0
[pid 49027] lseek(23, 0, SEEK_SET)      = 0
[pid 49027] lseek(23, 0, SEEK_CUR)      = 0

In Java 10, the following calls are repeated

[pid   444] fstat(8, {st_mode=S_IFREG|0664, st_size=126976, ...}) = 0
[pid   444] fallocate(8, 0, 0, 131072)  = 0
[pid   444] lseek(8, 0, SEEK_SET)       = 0
[pid   444] lseek(8, 0, SEEK_CUR)       = 0

In particular, fallocate does a lot more work than ftruncate and the time taken appears to be proportional to the length of the file, not the length added to the file.
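
As a rough check on the "proportional to the length of the file" observation, the Java 10 timings listed earlier can be turned into a per-round slope. This is only arithmetic on the numbers already posted; the class name GrowthRate is made up for illustration.

```java
// Hypothetical helper: arithmetic on the Java 10 (CentOS) timings posted above.
final class GrowthRate {

    // Average call time in microseconds for each of the 10 outer iterations.
    static final long[] JAVA10_US = {311, 856, 1423, 1975, 2530, 3045, 3599, 4034, 4523, 5129};

    // Each outer iteration moves the file length up by 5000 pages of 4096 bytes.
    static final long BYTES_PER_ROUND = 5000L * 4096;

    // Mean increase in call time from one round to the next (integer microseconds).
    static long averageIncreasePerRound(long[] us) {
        return (us[us.length - 1] - us[0]) / (us.length - 1);
    }

    public static void main(String[] args) {
        System.out.println("Each round adds " + BYTES_PER_ROUND + " bytes and about "
                + averageIncreasePerRound(JAVA10_US) + " us per call.");
    }
}
```

The call time rises by roughly 535 us for every additional 20,480,000 bytes of file length, which is what you would expect if each call touches the whole file rather than just the newly added page.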

One workaround is to:

  • use reflection to obtain the fd file descriptor
  • use JNA or FFI to call ftruncate.

This seems like a hacky solution. Is there a better alternative in Java 10?

Peter Lawrey asked May 21 '18 13:05


1 Answer

Is there a way to diagnose this kind of problem?

You can use a kernel-aware Java profiler like async-profiler.

Here is what it shows for JDK 8:

[Image: JDK 8 profile for RandomAccessFile.setLength]

and for JDK 10:

[Image: JDK 10 profile for RandomAccessFile.setLength]

The profiles confirm your conclusion that RandomAccessFile.setLength uses the ftruncate syscall on JDK 8, but a much heavier fallocate on JDK 10.

ftruncate is really fast because it only updates file metadata, while fallocate actually allocates disk space (or physical memory in the case of tmpfs).

This change was made in an attempt to fix JDK-8168628: SIGBUS when extending file size to map it. It was later recognized as a bad idea, and the fix was reverted in JDK 11: JDK-8202261.

Is there any solution or alternative which works efficiently on Java 10?

There is an internal class, sun.nio.ch.FileDispatcherImpl, which has a static truncate0 method. It uses the ftruncate syscall under the hood. You can call it via reflection, bearing in mind that this is a private, unsupported API.

Class<?> c = Class.forName("sun.nio.ch.FileDispatcherImpl");
Method m = c.getDeclaredMethod("truncate0", FileDescriptor.class, long.class);
m.setAccessible(true);
m.invoke(null, raf.getFD(), length);
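
The snippet above can be wrapped into a self-contained helper along these lines. This is a sketch under a few assumptions: the class name FastSetLength and the fallback to RandomAccessFile.setLength are my additions, and the internal method may be missing or inaccessible on other JDK versions (newer JDKs encapsulate sun.nio.ch by default, so the lookup only succeeds with --add-opens java.base/sun.nio.ch=ALL-UNNAMED, if the method is there at all).

```java
import java.io.File;
import java.io.FileDescriptor;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical wrapper around the unsupported internal API described above.
final class FastSetLength {

    // Resolved once; null when the internal method is missing or inaccessible.
    private static final Method TRUNCATE0 = lookupTruncate0();

    private static Method lookupTruncate0() {
        try {
            Class<?> c = Class.forName("sun.nio.ch.FileDispatcherImpl");
            Method m = c.getDeclaredMethod("truncate0", FileDescriptor.class, long.class);
            m.setAccessible(true); // may be refused by the module system
            return m;
        } catch (ReflectiveOperationException | RuntimeException | LinkageError e) {
            return null; // callers fall back to the public API
        }
    }

    // Sets the file length via truncate0 when available, else via raf.setLength.
    // Note: truncate0 does not touch the file pointer, unlike raf.setLength.
    static void setLength(RandomAccessFile raf, long length) throws IOException {
        if (TRUNCATE0 != null) {
            try {
                TRUNCATE0.invoke(null, raf.getFD(), length);
                return;
            } catch (InvocationTargetException e) {
                Throwable cause = e.getCause();
                if (cause instanceof IOException) {
                    throw (IOException) cause;
                }
                throw new IOException(cause);
            } catch (IllegalAccessException e) {
                // fall through to the supported API
            }
        }
        raf.setLength(length);
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("deleteme", "dat");
        tmp.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(tmp, "rw")) {
            setLength(raf, 1L << 20);
            System.out.println("Length is now " + raf.length());
        }
    }
}
```

Looking the method up once keeps the reflective overhead out of the hot loop, and the fallback means the code still works (just more slowly on JDK 10) when the internal API is unavailable.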

apangin answered Sep 27 '22 20:09