Unique file identifier in Windows

Is there a way to uniquely identify a file (and possibly a directory) for its whole lifetime, regardless of moves, renames and content modifications, on Windows 2000 and later? Making a copy of a file should give the copy its own unique identifier.

My application associates various meta-data with individual files. If files are modified, renamed or moved it would be useful to be able to automatically detect and update file associations.

FileSystemWatcher can provide events that report these sorts of changes; however, it uses a memory buffer that can easily fill up (and lose events) if many file system events occur quickly.
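For context, the overflow can at least be detected. The following is a minimal sketch, not a fix: `WatcherSketch`, `CreateWatcher` and the `onOverflow` callback are hypothetical names used only for illustration. It raises `InternalBufferSize` from its 8 KB default to 64 KB and treats an `InternalBufferOverflowException` reported through the `Error` event as a signal that events were dropped and a full rescan is needed.

```csharp
using System;
using System.IO;

public static class WatcherSketch
{
    // Hypothetical helper: configures a watcher and reports buffer overflows.
    // The watcher stays idle until the caller sets EnableRaisingEvents = true.
    public static FileSystemWatcher CreateWatcher(string directory, Action onOverflow)
    {
        var watcher = new FileSystemWatcher(directory)
        {
            IncludeSubdirectories = true,
            // Default is 8192 bytes; a larger buffer only delays overflow, it does not prevent it.
            InternalBufferSize = 64 * 1024,
            NotifyFilter = NotifyFilters.FileName | NotifyFilters.DirectoryName | NotifyFilters.LastWrite
        };

        watcher.Renamed += (sender, e) =>
            Console.WriteLine($"Renamed: {e.OldFullPath} -> {e.FullPath}");

        watcher.Error += (sender, e) =>
        {
            // Events were lost; the caller should rescan the directory tree.
            if (e.GetException() is InternalBufferOverflowException)
                onOverflow();
        };

        return watcher;
    }
}
```

Even with a larger buffer, events can still be lost under heavy load, which is why a persistent identifier is still needed to reconcile state after a rescan.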

A hash is no use because the content of the file can change, and so the hash will change.

I had thought of using the file creation date; however, there are a few situations where this will not be unique (e.g. when multiple files are copied).

I've also heard of a file SID (security ID?) in NTFS, but I'm not sure if this would do what I'm looking for.

Any ideas?

Ash, asked Dec 08 '09 11:12



2 Answers

Here's sample code that returns a unique File Index.

ApproachA() is what I came up with after a bit of research. ApproachB() is thanks to information in the links provided by Mattias and Rubens. Given a specific file, both approaches return the same file index (during my basic testing).

Some caveats from MSDN:

Support for file IDs is file system-specific. File IDs are not guaranteed to be unique over time, because file systems are free to reuse them. In some cases, the file ID for a file can change over time.

In the FAT file system, the file ID is generated from the first cluster of the containing directory and the byte offset within the directory of the entry for the file. Some defragmentation products change this byte offset. (Windows in-box defragmentation does not.) Thus, a FAT file ID can change over time. Renaming a file in the FAT file system can also change the file ID, but only if the new file name is longer than the old one.

In the NTFS file system, a file keeps the same file ID until it is deleted. You can replace one file with another file without changing the file ID by using the ReplaceFile function. However, the file ID of the replacement file, not the replaced file, is retained as the file ID of the resulting file.

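The first caveat above (that file IDs are not guaranteed to be unique over time and can change) worries me. It's not clear whether this statement applies to FAT only; it seems to contradict the later statement that, in NTFS, a file keeps the same file ID until it is deleted. I guess further testing is the only way to be sure.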

[Update: in my testing the file index/id changes when a file is moved from one internal NTFS hard drive to another internal NTFS hard drive.]

    using System;
    using System.IO;
    using System.Runtime.InteropServices;
    using FILETIME = System.Runtime.InteropServices.ComTypes.FILETIME;

    public class WinAPI
    {
        [DllImport("ntdll.dll", SetLastError = true)]
        public static extern IntPtr NtQueryInformationFile(IntPtr fileHandle, ref IO_STATUS_BLOCK IoStatusBlock, IntPtr pInfoBlock, uint length, FILE_INFORMATION_CLASS fileInformation);

        public struct IO_STATUS_BLOCK
        {
            uint status;
            ulong information;
        }

        public struct _FILE_INTERNAL_INFORMATION
        {
            public ulong IndexNumber;
        }

        // Abbreviated, there are more values than shown
        public enum FILE_INFORMATION_CLASS
        {
            FileDirectoryInformation = 1,     // 1
            FileFullDirectoryInformation,     // 2
            FileBothDirectoryInformation,     // 3
            FileBasicInformation,             // 4
            FileStandardInformation,          // 5
            FileInternalInformation           // 6
        }

        [DllImport("kernel32.dll", SetLastError = true)]
        public static extern bool GetFileInformationByHandle(IntPtr hFile, out BY_HANDLE_FILE_INFORMATION lpFileInformation);

        public struct BY_HANDLE_FILE_INFORMATION
        {
            public uint FileAttributes;
            public FILETIME CreationTime;
            public FILETIME LastAccessTime;
            public FILETIME LastWriteTime;
            public uint VolumeSerialNumber;
            public uint FileSizeHigh;
            public uint FileSizeLow;
            public uint NumberOfLinks;
            public uint FileIndexHigh;
            public uint FileIndexLow;
        }
    }

    public class Test
    {
        // Queries FileInternalInformation through the native NT API.
        public ulong ApproachA()
        {
            WinAPI.IO_STATUS_BLOCK iostatus = new WinAPI.IO_STATUS_BLOCK();
            WinAPI._FILE_INTERNAL_INFORMATION objectIDInfo = new WinAPI._FILE_INTERNAL_INFORMATION();

            int structSize = Marshal.SizeOf(objectIDInfo);

            // Unmanaged buffer that receives the FILE_INTERNAL_INFORMATION result
            IntPtr memPtr = Marshal.AllocHGlobal(structSize);

            FileInfo fi = new FileInfo(@"C:\Temp\testfile.txt");
            FileStream fs = fi.Open(FileMode.Open, FileAccess.Read, FileShare.ReadWrite);

            IntPtr res = WinAPI.NtQueryInformationFile(fs.Handle, ref iostatus, memPtr, (uint)structSize, WinAPI.FILE_INFORMATION_CLASS.FileInternalInformation);

            objectIDInfo = (WinAPI._FILE_INTERNAL_INFORMATION)Marshal.PtrToStructure(memPtr, typeof(WinAPI._FILE_INTERNAL_INFORMATION));

            fs.Close();
            Marshal.FreeHGlobal(memPtr);

            return objectIDInfo.IndexNumber;
        }

        // Uses the documented Win32 call and combines the two 32-bit halves of the index.
        public ulong ApproachB()
        {
            WinAPI.BY_HANDLE_FILE_INFORMATION objectFileInfo = new WinAPI.BY_HANDLE_FILE_INFORMATION();

            FileInfo fi = new FileInfo(@"C:\Temp\testfile.txt");
            FileStream fs = fi.Open(FileMode.Open, FileAccess.Read, FileShare.ReadWrite);

            WinAPI.GetFileInformationByHandle(fs.Handle, out objectFileInfo);

            fs.Close();

            ulong fileIndex = ((ulong)objectFileInfo.FileIndexHigh << 32) + (ulong)objectFileInfo.FileIndexLow;

            return fileIndex;
        }
    }

Ash, answered Sep 19 '22 21:09


If you call GetFileInformationByHandle, you'll get a file ID in BY_HANDLE_FILE_INFORMATION.nFileIndexHigh/Low. This index is unique within a volume, and stays the same even if you move the file (within the volume) or rename it.
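Because the index is only unique within a volume, one way to key metadata is to pair it with the volume serial number that the same structure returns. The sketch below assumes the three values have already been read from BY_HANDLE_FILE_INFORMATION; the `FileIdentity` type and its members are hypothetical names, not part of any Windows or .NET API.

```csharp
using System;

// Hypothetical value type: (volume serial number, 64-bit file index) as a lookup key.
public readonly struct FileIdentity : IEquatable<FileIdentity>
{
    public uint VolumeSerialNumber { get; }
    public ulong FileIndex { get; }

    public FileIdentity(uint volumeSerialNumber, uint fileIndexHigh, uint fileIndexLow)
    {
        VolumeSerialNumber = volumeSerialNumber;
        // Combine the two 32-bit halves into one 64-bit index.
        FileIndex = ((ulong)fileIndexHigh << 32) | fileIndexLow;
    }

    public bool Equals(FileIdentity other) =>
        VolumeSerialNumber == other.VolumeSerialNumber && FileIndex == other.FileIndex;

    public override bool Equals(object obj) => obj is FileIdentity other && Equals(other);

    public override int GetHashCode() =>
        unchecked((VolumeSerialNumber.GetHashCode() * 397) ^ FileIndex.GetHashCode());

    // Hex form, e.g. for storing alongside the metadata record.
    public override string ToString() => $"{VolumeSerialNumber:X8}:{FileIndex:X16}";
}
```

A value like this can serve as a dictionary key. It still changes when a file moves to another volume, since that is effectively a copy followed by a delete.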

If you can assume that NTFS is used, you may also want to consider using Alternate Data Streams to store the metadata.
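As a rough illustration of the Alternate Data Streams idea, the sketch below writes and reads a named stream using the `path:streamname` syntax. It assumes an NTFS volume and a .NET runtime that accepts such paths (.NET Core and later do; the classic .NET Framework file APIs reject them, so P/Invoke to CreateFile would be needed there). `AdsSketch` and the stream name `app.metadata` are hypothetical.

```csharp
using System.IO;

public static class AdsSketch
{
    // Hypothetical stream name; any valid NTFS stream name would do.
    private const string StreamName = "app.metadata";

    // Full path avoids "x:stream" being parsed as a drive-relative path.
    private static string StreamPath(string filePath) =>
        Path.GetFullPath(filePath) + ":" + StreamName;

    public static void WriteMetadata(string filePath, string metadata) =>
        File.WriteAllText(StreamPath(filePath), metadata);

    public static string ReadMetadata(string filePath) =>
        File.ReadAllText(StreamPath(filePath));
}
```

The stream travels with the file on renames and moves within NTFS, but it is dropped when the file is copied to a file system without stream support, such as FAT.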

Mattias S, answered Sep 23 '22 21:09