
Java Implementation of *nix sync() without a system call

I'm working to remove all system calls from an existing Java code base. We run our application in a commercially provided, closed-source JVM. When the JVM makes a system call via a Runtime.getRuntime().exec() call, the entire JVM process forks, which leads to serious performance hits. We run on a Linux platform but try to keep things as portable as possible.

I'm running into problems replacing a sync() call that we currently make via Runtime.getRuntime().exec(). I know there is this sync() method, and flush() as well, and based on this post I'm looking to do a sync and flush on all open file streams.

My issue is that I don't have direct knowledge of which file streams and descriptors are open. I thought one way around this would be to check the /proc/(jvm process number)/fd folder, but I can't find a good way to reliably get the JVM process number using pure Java. I also thought I might be able to get all objects of a certain class (the FileDescriptor class), but from what I'm reading this isn't feasible either.

Does anyone have suggestions on how to duplicate a *nix sync() call in pure java?

eharik asked Oct 07 '22 00:10


1 Answer

What you are doing is more than a sync call. You are trying to do a "flush all file buffers and sync" operation. You would have trouble doing this in C / C++ too.

In addition to the problem of finding all of the open files (which you probably could solve ...), there is a bigger problem: whether it is the right time to flush the buffers.
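On the "finding the open files" part, here is a minimal sketch, assuming a Linux system with /proc mounted. It reads /proc/self/fd, which refers to the current process, so no process number is needed. The class and method names (OpenFileLister, listOpenDescriptors) are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the descriptors the current process has open.
// Assumes Linux with /proc mounted; returns an empty list elsewhere.
class OpenFileLister {

    static List<String> listOpenDescriptors() throws IOException {
        // /proc/self always points at the calling process, so no PID lookup is needed.
        Path fdDir = Paths.get("/proc/self/fd");
        List<String> result = new ArrayList<>();
        if (!Files.isDirectory(fdDir)) {
            return result; // not Linux, or /proc is unavailable
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(fdDir)) {
            for (Path entry : entries) {
                try {
                    // Each entry is a symlink named after the descriptor number.
                    result.add(entry.getFileName() + " -> " + Files.readSymbolicLink(entry));
                } catch (IOException e) {
                    // The descriptor was closed between listing and reading; skip it.
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        for (String line : listOpenDescriptors()) {
            System.out.println(line);
        }
    }
}
```

Note that this only shows which files are open at the OS level. It does not give you the Java stream objects, so it cannot flush data that is still held in Java-level buffers.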

Let's assume that your application is multi-threaded, and that one thread is responsible for calling sync. How does that thread know that other threads that are writing files have reached a consistent point with respect to the files; i.e. that if the application were killed and restarted, the (hypothetically) flushed files would contain a logically consistent state for the application? The answer is (most likely) that it doesn't know. So ... in fact ... the application is not in a significantly better position if it flushes before syncing.

And there is yet another problem. Assume that thread A is responsible for the flush / sync, and thread B is happily writing to some output stream. Consider this temporal sequence:

  1. Thread A flushes file
  2. Thread B writes to file
  3. Thread A calls sync

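In this sequence, thread B's write can still be sitting in an unflushed Java-level buffer when thread A calls sync, so the sync does not cover it. The only way to avoid this is to have thread A synchronize with, and block, all other threads that are writing to files ... before it does the flush(es) and the sync.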
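A minimal sketch of that kind of blocking for a single file, assuming the application controls every write to it. Writers take the shared side of a read/write lock, and the sync thread takes the exclusive side, so no write can land between the flush and the sync. The GuardedLog class is hypothetical, and FileDescriptor.sync() is used here as a stand-in: it syncs one descriptor only, which is not the same as a system-wide sync().

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical wrapper: all writes and all flush/sync calls go through one gate.
class GuardedLog implements AutoCloseable {
    private final ReentrantReadWriteLock gate = new ReentrantReadWriteLock();
    private final FileOutputStream file;
    private final BufferedOutputStream buffered;

    GuardedLog(File target) throws IOException {
        this.file = new FileOutputStream(target);
        this.buffered = new BufferedOutputStream(file);
    }

    // Writers hold the shared lock, so they never overlap with flushAndSync().
    void append(String record) throws IOException {
        gate.readLock().lock();
        try {
            buffered.write(record.getBytes(StandardCharsets.UTF_8));
        } finally {
            gate.readLock().unlock();
        }
    }

    // The sync thread holds the exclusive lock for the whole flush + sync step.
    void flushAndSync() throws IOException {
        gate.writeLock().lock();
        try {
            buffered.flush();     // Java-level buffer -> operating system
            file.getFD().sync();  // OS buffers for this descriptor -> storage device
        } finally {
            gate.writeLock().unlock();
        }
    }

    @Override
    public void close() throws IOException {
        gate.writeLock().lock();
        try {
            buffered.close();
        } finally {
            gate.writeLock().unlock();
        }
    }
}
```

This closes the gap between the flush and the sync, but it does not by itself tell the sync thread whether the file contents are logically consistent at that moment.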

My advice would be to just do the sync, and forget about the flushes. Deal with the problem of inconsistent files the classic way (by having the application write to a temporary file, and do an atomic rename), or by having the sync thread coordinate with the thread(s) writing the file ... so that it only "syncs" when the critical files are consistent.
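The temporary-file approach could look roughly like this. It is a sketch, assuming the temporary file and the target live in the same directory (and therefore on the same filesystem) so that the rename can be atomic. AtomicFileWriter and writeAtomically are hypothetical names. FileChannel.force(true) is used to push the temporary file's contents to storage before the rename.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: readers of `target` see either the old or the new contents,
// never a half-written file.
class AtomicFileWriter {

    static void writeAtomically(Path target, String contents) throws IOException {
        // Create the temp file next to the target so both are on the same filesystem.
        Path dir = target.toAbsolutePath().getParent();
        Path temp = Files.createTempFile(dir, "pending-", ".tmp");
        try {
            try (FileChannel channel = FileChannel.open(temp, StandardOpenOption.WRITE)) {
                ByteBuffer data = ByteBuffer.wrap(contents.getBytes(StandardCharsets.UTF_8));
                while (data.hasRemaining()) {
                    channel.write(data);
                }
                channel.force(true); // push contents and metadata to the storage device
            }
            // Throws AtomicMoveNotSupportedException if the filesystem cannot do this atomically.
            Files.move(temp, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            // No-op after a successful move; cleans up the temp file after a failure.
            Files.deleteIfExists(temp);
        }
    }
}
```

Whether ATOMIC_MOVE replaces an existing target is implementation specific according to the Files.move documentation. On Linux it maps to a rename, which does replace the target.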

Stephen C answered Oct 10 '22 01:10