I've already built a recursive function to get the directory size of a folder path. It works; however, with the growing number of directories I have to search through (and the number of files in each respective folder), it has become very slow and inefficient.
static string GetDirectorySize(string parentDir)
{
    long totalFileSize = 0;
    string[] dirFiles = Directory.GetFiles(parentDir, "*.*",
                                           System.IO.SearchOption.AllDirectories);
    foreach (string fileName in dirFiles)
    {
        // Use FileInfo to get the length of each file.
        FileInfo info = new FileInfo(fileName);
        totalFileSize += info.Length;
    }
    return String.Format(new FileSizeFormatProvider(), "{0:fs}", totalFileSize);
}
This searches all subdirectories of the argument path, so the dirFiles array gets quite large. Is there a better way to accomplish this? I've searched around but haven't found anything yet.
Another idea that crossed my mind was caching the results and, when the function is called again, finding the differences and only re-searching the folders that have changed. Not sure if that's a good approach either...
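For concreteness, such a cache might look something like this sketch (the names are illustrative; note that NTFS only bumps a directory's LastWriteTime when entries are created, deleted or renamed, not when an existing file is rewritten in place, so a sketch like this can return stale sizes in that case):

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Sketch of the caching idea: remember each folder's direct-file total
// keyed on its last-write timestamp, and always recurse so that changes
// deeper in the tree are still picked up.
static class DirectorySizeCache
{
    private static readonly Dictionary<string, Tuple<DateTime, long>> Cache =
        new Dictionary<string, Tuple<DateTime, long>>(StringComparer.OrdinalIgnoreCase);

    public static long GetSizeCached(string dir)
    {
        // Caveat: rewriting an existing file in place does not change the
        // directory timestamp, so that file's new size would be missed.
        DateTime stamp = Directory.GetLastWriteTimeUtc(dir);

        Tuple<DateTime, long> entry;
        if (!Cache.TryGetValue(dir, out entry) || entry.Item1 != stamp)
        {
            long direct = Directory.EnumerateFiles(dir)
                                   .Sum(f => new FileInfo(f).Length);
            entry = Tuple.Create(stamp, direct);
            Cache[dir] = entry;
        }

        long total = entry.Item2;
        foreach (string sub in Directory.EnumerateDirectories(dir))
            total += GetSizeCached(sub);
        return total;
    }
}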
You are first scanning the tree to get a list of all files. Then you are reopening every file to get its size. This amounts to scanning twice.
I suggest you use DirectoryInfo.GetFiles, which will hand you FileInfo objects directly. These objects are pre-filled with their length.

In .NET 4 you can also use the EnumerateFiles method, which will return a lazy IEnumerable<FileInfo>.

This is more cryptic, but it took about 2 seconds for 10k executions:
using System.IO;
using System.Linq;

public static long GetDirectorySize(string parentDirectory)
{
    // FileInfo.Length is pre-filled by the enumeration, so no file is reopened.
    return new DirectoryInfo(parentDirectory)
        .GetFiles("*.*", SearchOption.AllDirectories)
        .Sum(file => file.Length);
}
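For example, keeping the question's formatted output (the path is just an example; FileSizeFormatProvider is the question's own formatter):

long bytes = GetDirectorySize(@"C:\DataLoad");
string formatted = String.Format(new FileSizeFormatProvider(), "{0:fs}", bytes);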
Try:

// Requires: using System; using System.Diagnostics; using System.IO;
DirectoryInfo dirInfo = new DirectoryInfo(@"C:\DataLoad\");
Stopwatch sw = new Stopwatch();
try
{
    sw.Start();
    Int64 ttl = 0;
    Int32 fileCount = 0;
    foreach (FileInfo fi in dirInfo.EnumerateFiles("*", SearchOption.AllDirectories))
    {
        ttl += fi.Length;   // running total of file sizes, in bytes
        fileCount++;
    }
    sw.Stop();
    Debug.WriteLine(sw.ElapsedMilliseconds.ToString() + " " + fileCount.ToString());
}
catch (Exception ex)
{
    Debug.WriteLine(ex.ToString());
}
This did 700,000 files in 70 seconds on a desktop, non-RAID P4, so roughly 10,000 files a second. A server-class machine should easily manage 100,000+ per second.
As usr (+1) said, EnumerateFiles hands you FileInfo objects pre-filled with their length.
You may speed up your function a little by using EnumerateFiles() instead of GetFiles(); at the very least you won't load the full list into memory.
If that's not enough, you should make your function more sophisticated using threads (one thread per directory is too many, but there is no general rule).
You may use a fixed number of threads that pick directories from a queue; each thread calculates the size of a directory and adds it to the total. Something like the sketch below.
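A minimal sketch of that idea, assuming .NET 4's ConcurrentQueue; the method name GetDirectorySizeThreaded and the workerCount parameter are my own illustrative choices, not from any library:

using System.Collections.Concurrent;
using System.IO;
using System.Linq;
using System.Threading;

// Sketch: a fixed pool of threads drains a queue of directories; each
// thread sums the files directly inside one directory and adds the
// result to a shared total.
static long GetDirectorySizeThreaded(string root, int workerCount = 4)
{
    // Pre-fill the queue with the root and every subdirectory.
    var queue = new ConcurrentQueue<string>(
        Directory.EnumerateDirectories(root, "*", SearchOption.AllDirectories)
                 .Concat(new[] { root }));

    long total = 0;
    var workers = new Thread[workerCount];
    for (int i = 0; i < workerCount; i++)
    {
        workers[i] = new Thread(() =>
        {
            string dir;
            while (queue.TryDequeue(out dir))
            {
                // Sum only files directly in this directory; subdirectories
                // are separate work items already in the queue.
                long size = new DirectoryInfo(dir)
                    .EnumerateFiles()
                    .Sum(f => f.Length);
                Interlocked.Add(ref total, size);
            }
        });
        workers[i].Start();
    }
    foreach (var t in workers) t.Join();
    return total;
}

Note that pre-filling the queue still discovers directories on a single thread; spreading that discovery across the workers themselves is exactly the improvement described next.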
You may improve the algorithm a lot by spreading the directory discovery across all the threads (for example, when a thread parses a directory, it adds the subfolders it finds to the queue). It's up to you to make it more complicated if you find it's still too slow (this task was used by Microsoft as an example for the new Task Parallel Library).
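For reference, a minimal sketch of the TPL flavour (my own illustration, not Microsoft's actual sample): Parallel.ForEach partitions the lazy file enumeration across the thread pool, keeping a per-thread subtotal so no lock is taken per file. Since Length is already pre-filled, the per-file work is tiny and the gain is modest; the bigger win comes from parallelizing the directory walk itself, as in the queue approach above.

using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Sketch: partition the file enumeration across the thread pool with a
// per-thread subtotal, merging the subtotals once per thread at the end.
static long GetDirectorySizeParallel(string root)
{
    long total = 0;
    Parallel.ForEach(
        new DirectoryInfo(root).EnumerateFiles("*", SearchOption.AllDirectories),
        () => 0L,                                       // per-thread subtotal seed
        (file, loopState, subtotal) => subtotal + file.Length,
        subtotal => Interlocked.Add(ref total, subtotal));
    return total;
}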