This is a long shot, but is there a faster way to get the size, last access time, creation time, etc. for multiple files?
I have a long list of file paths (so I needn't enumerate) and need to look up that information as quickly as possible. Creating FileInfo's in parallel probably won't help much since the bottleneck should be the disk.
The NTFS journal unfortunately only keeps the file names, otherwise that'd be great. I guess the OS doesn't store that metadata somewhere?
Another possible optimization: is there a static or Win32 call that fetches all the information at once, rather than creating a bunch of FileInfo objects? (The File methods only let me get one piece of information at a time.)
Anyway, I'd be glad if anyone knows something that might help. Unfortunately I do have to micro-optimize here, and no, "using a database" isn't a viable answer ;)
There are static methods on System.IO.File to get what you want. It's a micro-optimization, but it might be what you need: GetLastAccessTime, GetCreationTime.
I'll leave the text above because you specifically asked for static methods. However, I think you are better off using FileInfo (you should measure just to be sure). Both File and FileInfo use an internal method on File called FillAttributeInfo to get the data you are after. For the properties you need, FileInfo will need to call this method once. File will have to call it on each call, since the attribute info object is thrown away when the method finishes (since it's static).
So my hunch is, when you need multiple attributes, a FileInfo for each file will be faster. But in performance situations, you should always measure! Faced with this problem, I would try both managed options as outlined above and make a benchmark, both when running in serial and in parallel. Then decide if it's fast enough.
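To make the "measure it" advice concrete, here is a minimal serial benchmark sketch. The class and method names (AttributeBenchmark, TimeFileInfo, TimeStaticFile) are made up for illustration, and both variants read only the three timestamps, because File has no static method for the size. Treat the numbers as rough: the first pass warms the file system cache, so run each variant several times and on your real file list.

```csharp
using System;
using System.Diagnostics;
using System.IO;

// Hypothetical helper for comparing the two managed options.
static class AttributeBenchmark
{
    // One FileInfo per file, then three property reads on that object.
    public static TimeSpan TimeFileInfo(string[] paths, out long checksum)
    {
        checksum = 0;
        Stopwatch watch = Stopwatch.StartNew();
        foreach (string path in paths)
        {
            FileInfo info = new FileInfo(path);
            // XOR the ticks so the reads cannot be optimized away.
            checksum ^= info.CreationTimeUtc.Ticks;
            checksum ^= info.LastAccessTimeUtc.Ticks;
            checksum ^= info.LastWriteTimeUtc.Ticks;
        }
        watch.Stop();
        return watch.Elapsed;
    }

    // Three separate static calls per file.
    public static TimeSpan TimeStaticFile(string[] paths, out long checksum)
    {
        checksum = 0;
        Stopwatch watch = Stopwatch.StartNew();
        foreach (string path in paths)
        {
            checksum ^= File.GetCreationTimeUtc(path).Ticks;
            checksum ^= File.GetLastAccessTimeUtc(path).Ticks;
            checksum ^= File.GetLastWriteTimeUtc(path).Ticks;
        }
        watch.Stop();
        return watch.Elapsed;
    }
}
```

Call both methods with the same array of paths and compare the returned TimeSpan values. A parallel variant could wrap the loop body in Parallel.ForEach, but whether that helps depends on the disk, as the question already suspects.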
If it is not fast enough, you need to resort to calling the Win32 API directly. It wouldn't be too hard to look at File.FillAttributeInfo in the reference sources and come up with something similar.
In fact, if you really need it, this is the code required to call the Win32 API directly, using the same approach as the internal code for File but with one OS call to get all the attributes. I think you should only use it if it is really necessary. You'll have to convert from FILETIME to a usable DateTime yourself, etc., so you get some more work to do manually.
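using System;
using System.ComponentModel;
using System.IO;
using System.Runtime.InteropServices;

static class FastFile
{
    private const int MAX_PATH = 260;
    private const int MAX_ALTERNATE = 14;

    // FindFirstFile signals failure with INVALID_HANDLE_VALUE (-1), not NULL.
    private static readonly IntPtr INVALID_HANDLE_VALUE = new IntPtr(-1);

    public static WIN32_FIND_DATA GetFileData(string fileName)
    {
        WIN32_FIND_DATA data;
        IntPtr handle = FindFirstFile(fileName, out data);
        if (handle == INVALID_HANDLE_VALUE)
        {
            // Read the error code right away, before any other call can overwrite it.
            int error = Marshal.GetLastWin32Error();
            throw new IOException("FindFirstFile failed for " + fileName, new Win32Exception(error));
        }
        FindClose(handle);
        return data;
    }

    // CharSet.Unicode selects FindFirstFileW, which matches the Unicode layout of WIN32_FIND_DATA below.
    [DllImport("kernel32", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern IntPtr FindFirstFile(string fileName, out WIN32_FIND_DATA data);

    [DllImport("kernel32", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool FindClose(IntPtr hFindFile);

    [StructLayout(LayoutKind.Sequential)]
    public struct FILETIME
    {
        public uint dwLowDateTime;
        public uint dwHighDateTime;
    }

    [StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
    public struct WIN32_FIND_DATA
    {
        public FileAttributes dwFileAttributes;
        public FILETIME ftCreationTime;
        public FILETIME ftLastAccessTime;
        public FILETIME ftLastWriteTime;
        // The size is split into two unsigned 32-bit halves.
        public uint nFileSizeHigh;
        public uint nFileSizeLow;
        public uint dwReserved0;
        public uint dwReserved1;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = MAX_PATH)]
        public string cFileName;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = MAX_ALTERNATE)]
        public string cAlternate;
    }
}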
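The manual conversion mentioned above is small. As a sketch, assuming the high and low halves are read as unsigned 32-bit values, the two parts of a FILETIME (or of the file size) can be combined like this. The helper name FindDataConversion and its methods are made up for illustration; DateTime.FromFileTimeUtc is the framework method that interprets the 64-bit value as 100-nanosecond intervals since 1601-01-01 UTC.

```csharp
using System;

// Hypothetical helper for turning raw find-data fields into usable values.
static class FindDataConversion
{
    // Joins two unsigned 32-bit halves into one 64-bit value.
    public static long Combine(uint high, uint low)
    {
        return ((long)high << 32) | low;
    }

    // Converts the two halves of a FILETIME into a UTC DateTime.
    public static DateTime ToDateTimeUtc(uint highDateTime, uint lowDateTime)
    {
        return DateTime.FromFileTimeUtc(Combine(highDateTime, lowDateTime));
    }
}
```

With the struct from the answer, the size would be FindDataConversion.Combine(data.nFileSizeHigh, data.nFileSizeLow), and the creation time would be FindDataConversion.ToDateTimeUtc(data.ftCreationTime.dwHighDateTime, data.ftCreationTime.dwLowDateTime). If the fields are declared as int instead, cast them to uint first so a negative low half does not sign-extend into the high bits.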
.NET's DirectoryInfo and FileInfo classes are incredibly slow in this respect, especially when used with network shares.
If many of the files to be "scanned" are in the same directory, you'll get much faster results (depending on the situation, orders of magnitude faster) by using the Win32 API's FindFirstFile, FindNextFile and FindClose functions. This is true even if you have to ask for more information than you actually need (e.g. if you ask for all ".log" files in a directory, but only need 75% of them).
Actually, .NET's info classes also use these Win32 API functions internally. But they only "remember" the file names. When asking for more information on a bunch of files (e.g. LastModified), a separate (network) request is made for each file, which takes time.
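A rough sketch of that approach, Windows only, might look like the following. The names DirectoryScanner, Entry and Scan are invented for this example; the Win32 functions and error codes are the documented ones. It walks one directory with FindFirstFile/FindNextFile and takes size and timestamps straight from the find data, so no per-file request is needed.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.IO;
using System.Runtime.InteropServices;

// Hypothetical helper: one directory scan returns metadata for every match.
static class DirectoryScanner
{
    public sealed class Entry
    {
        public string Name;
        public long Size;
        public DateTime CreationTimeUtc, LastAccessTimeUtc, LastWriteTimeUtc;
    }

    private const int ERROR_FILE_NOT_FOUND = 2;
    private const int ERROR_NO_MORE_FILES = 18;
    private static readonly IntPtr InvalidHandle = new IntPtr(-1);

    public static List<Entry> Scan(string directory, string pattern)
    {
        var entries = new List<Entry>();
        WIN32_FIND_DATA data;
        IntPtr handle = FindFirstFile(Path.Combine(directory, pattern), out data);
        if (handle == InvalidHandle)
        {
            int error = Marshal.GetLastWin32Error();
            if (error == ERROR_FILE_NOT_FOUND) return entries; // nothing matched
            throw new Win32Exception(error);
        }
        try
        {
            do
            {
                // Skip subdirectories, including "." and "..".
                if ((data.dwFileAttributes & FileAttributes.Directory) != 0) continue;
                entries.Add(new Entry
                {
                    Name = data.cFileName,
                    Size = Combine(data.nFileSizeHigh, data.nFileSizeLow),
                    CreationTimeUtc = DateTime.FromFileTimeUtc(Combine(data.ftCreationTime.High, data.ftCreationTime.Low)),
                    LastAccessTimeUtc = DateTime.FromFileTimeUtc(Combine(data.ftLastAccessTime.High, data.ftLastAccessTime.Low)),
                    LastWriteTimeUtc = DateTime.FromFileTimeUtc(Combine(data.ftLastWriteTime.High, data.ftLastWriteTime.Low))
                });
            } while (FindNextFile(handle, out data));

            // FindNextFile returning false is only normal when the list is exhausted.
            int last = Marshal.GetLastWin32Error();
            if (last != ERROR_NO_MORE_FILES) throw new Win32Exception(last);
        }
        finally
        {
            FindClose(handle);
        }
        return entries;
    }

    private static long Combine(uint high, uint low)
    {
        return ((long)high << 32) | low;
    }

    [StructLayout(LayoutKind.Sequential)]
    private struct FILETIME
    {
        public uint Low;
        public uint High;
    }

    [StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
    private struct WIN32_FIND_DATA
    {
        public FileAttributes dwFileAttributes;
        public FILETIME ftCreationTime;
        public FILETIME ftLastAccessTime;
        public FILETIME ftLastWriteTime;
        public uint nFileSizeHigh;
        public uint nFileSizeLow;
        public uint dwReserved0;
        public uint dwReserved1;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 260)]
        public string cFileName;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 14)]
        public string cAlternateFileName;
    }

    [DllImport("kernel32", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern IntPtr FindFirstFile(string fileName, out WIN32_FIND_DATA data);

    [DllImport("kernel32", CharSet = CharSet.Unicode, SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool FindNextFile(IntPtr hFindFile, out WIN32_FIND_DATA data);

    [DllImport("kernel32", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool FindClose(IntPtr hFindFile);
}
```

For a list of known paths, one option is to group the paths by directory, call Scan once per directory with a pattern such as "*", and look the wanted names up in the result. Whether that beats per-file calls depends on how many of the files in each directory you actually need, so it is worth measuring.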
Is it possible to use the DirectoryInfo class?
DirectoryInfo d = new DirectoryInfo(@"c:\Temp");
FileInfo[] f = d.GetFiles();
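To show where the requested values would come from with this approach, here is a small sketch that reads them off each FileInfo. The class and method names (DirectoryListing, Describe) are made up for the example, and it uses EnumerateFiles (available since .NET 4) instead of GetFiles so that results are produced lazily. Whether this is fast enough is something to measure, since it only helps when the wanted files share a directory.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: one line of metadata per file in a directory.
static class DirectoryListing
{
    public static IEnumerable<string> Describe(string directory)
    {
        DirectoryInfo dir = new DirectoryInfo(directory);
        foreach (FileInfo file in dir.EnumerateFiles())
        {
            // Name, size in bytes, creation time and last access time (UTC, ISO 8601).
            yield return string.Format(
                "{0}\t{1}\t{2:o}\t{3:o}",
                file.Name, file.Length, file.CreationTimeUtc, file.LastAccessTimeUtc);
        }
    }
}
```

A caller would loop over DirectoryListing.Describe(@"c:\Temp") and print or parse each line.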