 

Hack for a "real" Java flush on a remote/virtual disk

I'm looking for a "trick" or a "hack" to be certain that a file has really been persisted to a remote disk, past the VMware cache, the NAS cache, and so on.

Flushing and closing a FileOutputStream is not enough, and I think FileChannel.force(true) isn't either.
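
To be concrete, this is roughly the kind of flushing I mean (a minimal sketch, without error handling). As far as I understand, even getFD().sync() and force(true) only push the data down to the OS and hypervisor, which are exactly the layers I don't trust:

    import java.io.FileOutputStream;
    import java.io.IOException;

    public class LocalFlush {
        // Pushes the bytes as far down as the JVM can reach:
        // stream -> OS page cache -> (hopefully) the physical disk.
        static void writeAndFlush(String path, byte[] data) throws IOException {
            try (FileOutputStream out = new FileOutputStream(path)) {
                out.write(data);
                out.flush();                  // no-op for FileOutputStream, but harmless
                out.getFD().sync();           // ask the OS to flush file data and metadata
                out.getChannel().force(true); // same request via NIO; true = metadata too
            }
        }
    }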

I'm thinking of something like this:

  • write the file, then read it back and compare
  • write the file, check its timestamp, rename it, then check that the timestamp changed
  • write the file with "wrong" content, overwrite it with the real content, then read it back and check the content

Maybe someone has had the same problem and found a solution.
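
For example, the first idea would look something like this (just a sketch; a checksum instead of a full byte comparison would also do). The obvious weakness is that the read-back may be served by the very same cache that hasn't reached the disk yet:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.Arrays;

    public class WriteAndVerify {
        // Write the file, then read it back and compare byte-for-byte.
        // Caveat: the read may come from the same cache that has not reached
        // the physical disk yet, so a match is not a hard guarantee.
        static void writeVerified(Path target, byte[] data) throws IOException {
            Files.write(target, data);
            byte[] readBack = Files.readAllBytes(target);
            if (!Arrays.equals(data, readBack)) {
                throw new IOException("read-back mismatch for " + target);
            }
        }
    }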

My requirement is not to lose data. The Java application works like this:

  1. accept a file from a remote source
  2. add a digital signature and a certified timestamp, creating a new file. If this file is lost, it cannot be recreated in any way.
  3. write this file to the storage
  4. mark the file as signed in the database
  5. tell the remote side that everything is ok
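
To make the ordering explicit, the flow is roughly the following (hypothetical outline; every method name is a placeholder, not the real code):

    import java.io.IOException;
    import java.nio.file.Path;

    // Hypothetical outline of the steps above; the methods are placeholders.
    public class SigningPipeline {

        void handle(byte[] incoming) throws IOException {  // step 1: file received
            byte[] signed = signAndTimestamp(incoming);    // step 2: cannot be recreated if lost
            Path stored = storeDurably(signed);            // step 3: must really reach the disk
            markAsSignedInDatabase(stored);                // step 4
            acknowledgeRemote();                           // step 5: only after 3 and 4 succeed
        }

        byte[] signAndTimestamp(byte[] data) { return data; }                 // placeholder
        Path storeDurably(byte[] data) throws IOException { return null; }    // placeholder
        void markAsSignedInDatabase(Path file) { }                            // placeholder
        void acknowledgeRemote() { }                                          // placeholder
    }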

Tonight we had a crash, and three transactions failed after step 5 but before the data was actually flushed to the remote store. So the database says everything is fine, the remote side was told the same, but 15 seconds' worth of signed data was lost. And that is not acceptable.

The correct solution would be a "sync mount" of the remote file system, but that is not going to happen any time soon. Even then, I would not completely trust it, given that the app is running on a VMware guest.

So I'd like a "best-effort hack" to prevent (or at least mitigate) incidents like this one.

asked Nov 04 '22 by lorenzo


1 Answer

Let's start with one assumption: you cannot guarantee any single write to any single disk. There are just too many layers of software and hardware between your write and the disk platter. And even if you could guarantee the write, you cannot guarantee that the data will be readable. It's possible that the disk will crash between the write and the read.

The only solution is redundancy, either provided by a framework (e.g., an RDBMS) or by your app.

When you receive and sign the file, you need to send it to multiple destinations on different physical hosts, and wait for them to reply that they saved the file. One of them might crash. Two of them might crash. How important the data is will determine how many remote hosts you need.
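
A very rough sketch of that idea, where Replica and sendAndConfirm() are placeholders for whatever transport you choose (HTTP endpoints, a second NFS mount, rsync over SSH, ...); the write only counts as done once enough hosts have confirmed:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;
    import java.util.concurrent.TimeUnit;

    // Rough sketch only: Replica and sendAndConfirm() are placeholders, not an existing API.
    public class RedundantStore {

        interface Replica {
            // Must return only after the remote host confirms the bytes are persisted.
            void sendAndConfirm(String name, byte[] data) throws Exception;
        }

        private final ExecutorService pool = Executors.newCachedThreadPool();

        // Sends the file to every replica in parallel and fails unless at least
        // 'required' of them confirm the write.
        void storeOnReplicas(List<Replica> replicas, int required,
                             String name, byte[] data) throws Exception {
            List<Future<?>> futures = new ArrayList<>();
            for (Replica r : replicas) {
                futures.add(pool.submit(() -> { r.sendAndConfirm(name, data); return null; }));
            }

            int confirmed = 0;
            for (Future<?> f : futures) {
                try {
                    f.get(30, TimeUnit.SECONDS);  // wait for this replica's confirmation
                    confirmed++;
                } catch (Exception e) {
                    // one replica failing is fine as long as enough others confirm
                }
            }
            if (confirmed < required) {
                throw new Exception("only " + confirmed + " of " + replicas.size()
                        + " replicas confirmed the write, need " + required);
            }
        }
    }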

Incidentally, redundancy also applies to your database. The fact that a transaction committed does not mean that you'll be able to recover it after a database crash (although DBMS engineers have far more experience than either you or I in ensuring writes, it all depends on a sysadmin who understands things like "logs and data files must reside on separate physical drives"). I strongly recommend that you (redundantly) store enough metadata along with the file to be able to reconstruct the database entry.
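
For example (just a sketch, the metadata fields are invented): write a small sidecar file next to each signed file, so the table can be rebuilt by scanning the store if the database is lost.

    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Path;

    // Sketch only: the metadata fields are invented for illustration. The point is
    // that everything needed to rebuild the database row travels with the file.
    public class MetadataSidecar {
        static void storeWithMetadata(Path dir, String fileName, byte[] signedData,
                                      String transactionId, String signedAt) throws IOException {
            Files.write(dir.resolve(fileName), signedData);

            String metadata = "transactionId=" + transactionId + "\n"
                            + "fileName=" + fileName + "\n"
                            + "signedAt=" + signedAt + "\n"
                            + "size=" + signedData.length + "\n";
            Files.write(dir.resolve(fileName + ".meta"),
                        metadata.getBytes(StandardCharsets.UTF_8));
        }
    }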

answered Nov 13 '22 by parsifal