How to guarantee file integrity without mandatory file lock on OS X?

AFAIK, OS X is a BSD derivative and has no true mandatory file locking. If so, it seems I have no way to prevent other programs from writing to a file, even while I am writing to it myself.

How can I guarantee file integrity in such an environment? I don't care about integrity after my program has exited, because at that point it is the user's responsibility. But I think I need some kind of guarantee at least while my program is running.

How do other programs, especially database programs, guarantee file content integrity without mandatory locking? If there's a common technique or recommended practice, please let me know.

Update

I need this for the data layer of a GUI application for non-engineer users. Currently, my program is in the following situation.

  • The data is too big to fit in RAM, and hard even to copy temporarily. So it cannot be read or written atomically, and must be used directly from disk while the program is running.

  • It is a long-running professional GUI content editor used by non-engineers. Even so, users can still access the file at the same time with Finder or other programs, so they can accidentally delete or overwrite a file that is in use. The problem is that users don't understand what is actually happening, and they expect the program to keep the file intact at least while it is running.

  • I think the only way to guarantee the file's integrity in this situation is:

    1. Open the file with a system-wide exclusive mandatory lock. Now the file is the program's responsibility.
    2. Check its integrity.
    3. Use the file like external memory while the program is running.
    4. Write all the modifications.
    5. Unlock. Now the file is the user's responsibility.

    Because OS X lacks a system-wide mandatory lock, I don't know how to do this. But I still believe there's a way to achieve this kind of file integrity that I just don't know about, and I want to know how everybody else handles it.

This question is not about programming errors in my own code; that's a separate problem. The current problem is protecting data from other programs that don't respect advisory file locks. Also, users are usually root and the program runs as the same user, so plain Unix file permissions are not useful.

eonil asked May 03 '13 16:05


3 Answers

You have to look at the problem that you are trying to actually solve with mandatory locking.

Mandatory locking does not guarantee file content integrity. Unless you keep your file locked 24/7, integrity will still depend on all processes observing file format and access conventions (and can still fail due to hard drive errors, etc.).

What mandatory locking protects you against is programming errors that (by accident, not out of malice) fail to respect the proper locking protocols. At the same time, that protection is only partial, since failure to acquire a lock (mandatory or not) can still lead to file corruption. Mandatory locking can also reduce possible concurrency more than needed. In short, mandatory locking provides more protection than advisory locking against software defects, but the protection is not complete.
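To make the difference concrete, here is a minimal sketch of what an advisory lock does and does not stop. It uses Python's standard `fcntl.flock`; the function name `probe_advisory_lock` is made up for illustration, and the sketch assumes a local filesystem where `flock` locks are tracked per open file.

```python
import fcntl


def probe_advisory_lock(path):
    """Return (cooperating_blocked, rogue_write_landed) for a file we hold locked."""
    with open(path, "r+b") as owner:
        fcntl.flock(owner, fcntl.LOCK_EX)  # our exclusive advisory lock

        # A cooperating opener asks for the lock first and is refused.
        with open(path, "r+b") as polite:
            try:
                fcntl.flock(polite, fcntl.LOCK_EX | fcntl.LOCK_NB)
                cooperating_blocked = False
            except BlockingIOError:
                cooperating_blocked = True

        # A rogue opener never asks for the lock, so nothing stops its write.
        with open(path, "r+b") as rogue:
            rogue.write(b"X")

    with open(path, "rb") as check:
        rogue_write_landed = check.read(1) == b"X"
    return cooperating_blocked, rogue_write_landed
```

Only code that follows the locking protocol is held back; the write that skips the protocol goes through, which is exactly the gap the question is worried about.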

One solution to the problem of accidental corruption is to use a library that is aggressively tested for preserving data integrity. One such library (there are others) is SQLite. On OS X, Core Data provides an abstraction layer over SQLite as a data store. Obviously, such an approach should be complemented by replication/backup, so that you are protected against causes of data corruption where the storage layer cannot help you (media failure, accidental deletion).
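As a rough illustration of leaning on SQLite for this, the sketch below uses Python's standard `sqlite3` module. The table name `items` and the helper names are assumptions made up for the example, not part of any existing schema. It shows an all-or-nothing write and SQLite's built-in `PRAGMA integrity_check`.

```python
import sqlite3


def open_store(path):
    """Open (or create) the store; the 'items' table is a made-up example schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items ("
        "id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def is_intact(conn):
    """Ask SQLite to verify the database structure; it reports 'ok' when sound."""
    return conn.execute("PRAGMA integrity_check").fetchone()[0] == "ok"


def add_items(conn, bodies):
    """Insert all bodies in one transaction: either every row lands or none do."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.executemany(
            "INSERT INTO items (body) VALUES (?)", [(b,) for b in bodies]
        )
```

A program could call `is_intact` once after opening the file and refuse to continue if it fails, while routing every modification through transactional helpers like `add_items`.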

Additional protection can be gained by restricting file access to a database and allowing access only through a gateway (such as a socket or messaging library). Then you will just have a single process running that merely acquires a lock (and never releases it). This setup is fairly easy to test; the lock is merely to prevent having more than one instance of the gateway process running.
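The "acquire a lock and never release it" part of such a gateway could look roughly like this. This is a sketch using Python's standard `fcntl.flock`; `acquire_instance_lock` and the lock file path are hypothetical names chosen for the example.

```python
import fcntl
import os


def acquire_instance_lock(lock_path):
    """Try to become the single gateway instance.

    Returns an open file descriptor that must be kept open for the life of
    the process, or None if another instance already holds the lock.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Non-blocking: fail immediately instead of waiting for the holder.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd
```

The gateway would call this once at startup and exit if it gets `None`. The operating system drops the lock when the process ends, even after a crash, so no stale-lock cleanup is needed.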

Reimer Behrends answered Sep 24 '22 00:09


One simple solution would be to simply hide the file from the user until your program is done using it.

There are various ways to hide files. It depends on whether you're modifying an existing file that was previously visible to the user or creating a new file. Even if modifying an existing file, it might be best to create a hidden working copy and then atomically exchange its contents with the file that's visible to the user.
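A portable approximation of the hidden working copy idea, sketched in Python: the helper name `edit_via_hidden_copy` and the `edit` callback are made up for illustration. Note that it uses `os.replace`, which atomically renames the working copy over the visible file rather than exchanging contents, and that copying the whole file may not be practical for data as large as the question describes.

```python
import os
import shutil
import tempfile


def edit_via_hidden_copy(path, edit):
    """Apply edit(working_path) to a dot-prefixed copy, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    # Same directory keeps the final rename on one filesystem, so it is atomic.
    fd, work = tempfile.mkstemp(prefix=".", suffix=".working", dir=directory)
    os.close(fd)
    try:
        shutil.copyfile(path, work)
        edit(work)
        with open(work, "r+b") as f:
            os.fsync(f.fileno())  # push the edited bytes to disk before the swap
        os.replace(work, path)  # readers see either the old file or the new one
    except BaseException:
        if os.path.exists(work):
            os.remove(work)  # leave the visible file untouched on failure
        raise
```

If `edit` raises, the visible file is never modified and the working copy is cleaned up.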

One approach to hiding a file is to create it in a location which is not normally visible to users. (That is, it's not necessary that the file be totally impossible for the user to reach, just out of the way so that they won't stumble on it.) You can obtain such a location using -[NSFileManager URLForDirectory:inDomain:appropriateForURL:create:error:] and passing NSItemReplacementDirectory and NSUserDomainMask for the first two parameters. See the -replaceItemAtURL:withItemAtURL:backupItemName:options:resultingItemURL:error: method for how to atomically move the file into its final place.

You can set a file to be hidden using various APIs. You can use -[NSURL setResourceValue:forKey:error:] with the key NSURLIsHiddenKey. You can use the chflags() system call to set UF_HIDDEN. The old Unix standby is to use a filename starting with a period ('.').
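For the last two options, a small sketch in Python. `os.chflags` and `stat.UF_HIDDEN` are the standard library's wrappers for the `chflags()` call and flag mentioned above, but `os.chflags` is only present on platforms that support it (such as OS X), so the sketch checks for it first. The helper names are made up for illustration.

```python
import os
import stat


def dot_hidden_name(path):
    """Return the same path with a leading period added to the file name."""
    directory, name = os.path.split(path)
    return os.path.join(directory, "." + name)


def set_hidden_flag(path):
    """Set UF_HIDDEN on path where chflags() exists; return whether it was set."""
    if not hasattr(os, "chflags"):
        return False  # platform has no chflags(); caller can fall back to a dot name
    current = os.stat(path).st_flags
    os.chflags(path, current | stat.UF_HIDDEN)  # keep any flags already set
    return True
```

Either way the file is only out of sight, not protected: a user who shows hidden files, or any program that knows the path, can still reach it.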

Ken Thomases answered Sep 23 '22 00:09


Here are some details about this topic: https://developer.apple.com/library/ios/documentation/FileManagement/Conceptual/FileSystemProgrammingGuide/FileCoordinators/FileCoordinators.html

Now I think the basic policy on OS X is something like this.

  • Always allow access by any process.
  • Always be prepared for shared data file mutation.
  • Be notified when another process mutates the file content, and respond appropriately. For example, you can display an error to end users if another process is trying to access the file. Users will then learn that it's a bad idea and will not do it again.
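
The linked guide covers Apple's file coordinator and presenter APIs. As a simpler, portable stand-in for the "be notified" step, the sketch below polls `os.stat` and compares a fingerprint of the file; it is not the Cocoa mechanism, and the names `fingerprint` and `ChangeWatcher` are made up for illustration.

```python
import os


def fingerprint(path):
    """Return (inode, size, mtime in ns) for path, or None if it is gone."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return None
    return (st.st_ino, st.st_size, st.st_mtime_ns)


class ChangeWatcher:
    """Detects changes to a file that this program did not make itself."""

    def __init__(self, path):
        self.path = path
        self.expected = fingerprint(path)

    def mark_own_write(self):
        """Call after the program's own writes so they are not reported."""
        self.expected = fingerprint(self.path)

    def check(self):
        """Return 'unchanged', 'modified', or 'deleted' since the last check."""
        current = fingerprint(self.path)
        if current == self.expected:
            return "unchanged"
        self.expected = current
        return "deleted" if current is None else "modified"
```

A GUI application could call `check()` from a timer and show a warning whenever it returns anything other than "unchanged". A change that preserves size, inode, and modification time would slip past this fingerprint, so treat it as a best-effort signal only.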
eonil answered Sep 24 '22 00:09
