 

Windows Projected File System read only?


I tried to play around with Projected File System to implement a user mode ram drive (previously I had used Dokan). I have two questions:

  1. Is this a read-only projection? I could not find any notification sent to me when a file is opened from, say, Notepad and written to.

  2. Is the file actually created on the disk once I use PrjWriteFileData()? From what I have understood, yes.

In that case, what useful thing could one do with this library if there is no writing to the projected files? It seems to me that the only useful thing is to initially create a directory tree from somewhere else (say, a remote repo), but nothing beyond that. Dokan still seems the way to go.

Michael Chourdakis asked Mar 08 '19 18:03



1 Answer

The short answer:

  1. It's not read-only, but you can't write files directly to a "source" file system through a projected one.
  2. The WriteFileData method is used to populate placeholder files on the "scratch" (projected) file system, so it doesn't affect the "source" file system.

The long answer:

As stated in the comment by @zett42, ProjFS was mainly designed as a remote git file system. The main goal of any file versioning system is to handle multiple versions of files. From this a question arises: do we need to overwrite the file inside a remote repository on every ProjFS file write? That would be disastrous. When working with git, you always write files locally, and they are not synced until you push the changes to a remote repository.

When you enumerate files, nothing is written to the local file system. From the ProjFS documentation:

When a provider first creates a virtualization root it is empty on the local system. That is, none of the items in the backing data store have yet been cached to disk.

Only after a file is opened does ProjFS create a "placeholder" for it in the local file system. I assume that it's a file with a special structure (not a real one).

As files and directories under the virtualization root are opened, the provider creates placeholders on disk, and as files are read the placeholders are hydrated with contents.

What does "hydrated" mean? Most likely, it describes a special data structure partially filled with real data. I would imagine a placeholder as a sponge partially filled with data.

As items are opened, ProjFS requests information from the provider to allow placeholders for those items to be created in the local file system. As item contents are accessed, ProjFS requests those contents from the provider. The result is that from the user's perspective, virtualized files and directories appear similar to normal files and directories that already reside on the local file system.

Only after a file is updated (modified) does it stop being a placeholder; it becomes a "full file/directory":

For files: The file's content (primary data stream) has been modified. The file is no longer a cache of its state in the provider's store. Files that have been created on the local file system (i.e. that do not exist in the provider's store at all) are also considered to be full files.

For directories: Directories that have been created on the local file system (i.e. that do not exist in the provider's store at all) are considered to be full directories. A directory that was created on disk as a placeholder never becomes a full directory.

This means that on the first write the placeholder is replaced by a real file in the local file system. But how do we keep the "remote" file in sync with the modified one? (1)
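Before turning to that question, the state changes described so far can be summarized in a small model. This is only a sketch: it does not call ProjFS at all, the names ItemState and ProjectedFile are made up for illustration, and the step where a write first fetches the contents is my assumption about the ordering.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical names: these states paraphrase the documentation's wording
// and are not types from the ProjFS headers.
enum class ItemState {
    Virtual,             // only projected in enumerations, nothing on disk
    Placeholder,         // metadata cached on disk after the first open
    HydratedPlaceholder, // contents cached on disk after the first read
    FullFile             // modified locally, no longer a cache of the store
};

struct ProjectedFile {
    std::string relativePath;
    ItemState state;

    explicit ProjectedFile(std::string path)
        : relativePath(std::move(path)), state(ItemState::Virtual) {
        assert(!relativePath.empty());
    }

    // Opening a virtual file is when the provider is asked for placeholder info.
    void open() {
        if (state == ItemState::Virtual) state = ItemState::Placeholder;
    }

    // Reading is when the provider is asked for the file contents.
    void read() {
        open();
        if (state == ItemState::Placeholder) state = ItemState::HydratedPlaceholder;
    }

    // A local modification turns the item into a full file.
    void write() {
        read();  // assumption: contents are fetched before the modification lands
        state = ItemState::FullFile;
    }
};
```

Calling open(), read() and write() in turn on a fresh ProjectedFile walks it through all four states, and a full file stays a full file no matter how often it is read afterwards.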

When the provider calls PrjWritePlaceholderInfo to write the placeholder information, it supplies the ContentID in the VersionInfo member of the placeholderInfo argument. The provider should then record that a placeholder for that file or directory was created in this view.

Notice "The provider should then record that a placeholder for that file or directory was created in this view". It means that, in order to sync the file later with the correct view, we have to remember which version a modified file is associated with. Imagine we are in a git repository and we change the branch. In this case, we may update one file multiple times in different branches. Now, why and when does the provider call PrjWritePlaceholderInfo?

... These placeholders represent the state of the backing store at the time they were created. These cached items, combined with the items projected by the provider in enumerations, constitute the client's "view" of the backing store. From time to time the provider may wish to update the client's view, whether because of changes in the backing store, or because of explicit action taken by the user to change their view.

Once again, imagine switching branches in a git repository: you have to update a file if it's different in another branch. Returning to question (1): imagine you want to "push" from a particular branch. First of all, you have to know which files were modified. If you did not record the placeholder info when the file was modified, you won't be able to do it correctly (at least for the git repository example).
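To make the bookkeeping idea concrete, here is a rough sketch of what such a record could look like on the provider side. Everything in it is hypothetical (ViewLedger, PlaceholderRecord and the method names are mine, not part of ProjFS); it is a plain in-memory map that only illustrates why the ContentID has to be remembered per path.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical provider-side bookkeeping; none of these names come from ProjFS.
struct PlaceholderRecord {
    std::string contentId;  // version that was written into the placeholder
    bool modifiedLocally;   // set once the file has become a full file
};

class ViewLedger {
public:
    // Call right after placeholder info for relativePath has been written.
    // Recording the same path again resets its entry to the new version.
    void recordPlaceholder(const std::string& relativePath,
                           const std::string& contentId) {
        assert(!contentId.empty());
        records_[relativePath] = PlaceholderRecord{contentId, false};
    }

    // Call when a notification reports that the file was modified.
    void markModified(const std::string& relativePath) {
        auto it = records_.find(relativePath);
        if (it != records_.end()) it->second.modifiedLocally = true;
    }

    // Paths that belong in a "push", mapped to the version each edit started from.
    std::map<std::string, std::string> modifiedFiles() const {
        std::map<std::string, std::string> result;
        for (const auto& entry : records_) {
            if (entry.second.modifiedLocally) {
                result[entry.first] = entry.second.contentId;
            }
        }
        return result;
    }

    // Unmodified placeholders whose cached version no longer matches newView
    // (path -> ContentID), e.g. after switching branches.
    std::vector<std::string> stalePlaceholders(
            const std::map<std::string, std::string>& newView) const {
        std::vector<std::string> stale;
        for (const auto& entry : records_) {
            if (entry.second.modifiedLocally) continue;
            auto it = newView.find(entry.first);
            if (it == newView.end() || it->second != entry.second.contentId) {
                stale.push_back(entry.first);
            }
        }
        return stale;
    }

private:
    std::map<std::string, PlaceholderRecord> records_;
};
```

With a ledger like this, a "push" only has to look at modifiedFiles(), and a branch switch only has to refresh whatever stalePlaceholders() returns; locally modified files are deliberately left alone in that second list.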

Remember that a placeholder is replaced by a real file on modification? ProjFS has an OnNotifyFileHandleClosedFileModifiedOrDeleted event. Here is the signature of the callback:

public void NotifyFileHandleClosedFileModifiedOrDeletedCallback(
    string relativePath,
    bool isDirectory,
    bool isFileModified,
    bool isFileDeleted,
    uint triggeringProcessId,
    string triggeringProcessImageFileName)

For our purposes, the most important parameter here is relativePath. It contains the name of the modified file inside the "scratch" (projected) file system. At this point you also know that the file is a real file (not a placeholder) and that it has already been written to disk (that is, you won't be able to intercept the call before the file is written). Now you may copy it to the desired location (or do it later), depending on your goals.
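The "do it later" option could look roughly like the sketch below: the handler only queues work, and something else drains the queue and talks to the "source" store. This is a standalone model, not ProjFS code. SyncQueue, SyncTask and SyncAction are invented names, and the handler merely mirrors the first four parameters of the callback shown above.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical names; this models the idea only and does not use ProjFS.
enum class SyncAction { Upload, Remove };

struct SyncTask {
    std::string relativePath;
    SyncAction action;
};

class SyncQueue {
public:
    // Mirrors the first four parameters of the notification callback.
    // By the time this runs, the change is already on the local disk.
    void onFileHandleClosed(const std::string& relativePath, bool isDirectory,
                            bool isFileModified, bool isFileDeleted) {
        assert(!relativePath.empty());
        if (isFileDeleted) {
            pending_.push_back({relativePath, SyncAction::Remove});
        } else if (isFileModified && !isDirectory) {
            pending_.push_back({relativePath, SyncAction::Upload});
        }
        // Handles closed without changes need no sync work.
    }

    bool empty() const { return pending_.empty(); }
    std::size_t size() const { return pending_.size(); }

    // Hands out the oldest pending task; the caller copies or deletes the
    // file in the "source" store whenever it suits.
    SyncTask next() {
        assert(!pending_.empty());
        SyncTask task = pending_.front();
        pending_.pop_front();
        return task;
    }

private:
    std::deque<SyncTask> pending_;
};
```

Keeping the handler this small is a design choice: the copy to the remote location can then be batched, retried or postponed without holding up whoever closed the file handle.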

To answer question #2: it seems that PrjWriteFileData is used only for populating the "scratch" file system, and you cannot use it to update the "source" file system.

Applications:

As for applications, you can still implement a remote file system (instead of using Dokan), but all writes will be cached locally instead of being written directly to a remote location. A couple of use case ideas:

  1. Distributed File Systems
  2. Online Drive Client
  3. A File System "Dispatcher" (for example, you may write your files in different folders depending on particular conditions)
  4. A File Versioning System (for example, you may preserve different versions of the same file after a modification)
  5. Mirroring data from your app to a file system (for example, you can "project" a text file with indentations to folders, sub-folders and files)

P.S.: I'm not aware of any undocumented APIs, but from my point of view (according to the documentation) we cannot use ProjFS for purposes like a RAM disk, or to write files directly to the "source" file system without writing them to the "local" file system first.

Pavel Sapehin answered Nov 15 '22 07:11
