I have a huge directory of about 500k jpg files, and I'd like to archive all files that are older than a certain date. Currently, the script takes hours to run.
This has a lot to do with the piss-poor performance of GoGrid's storage servers, but at the same time, I'm sure there's a much more efficient way, RAM- and CPU-wise, to accomplish what I'm doing.
Here's the code I have:
var dirInfo = new DirectoryInfo(PathToSource);
var fileInfo = dirInfo.GetFiles("*.*");
var filesToArchive = fileInfo.Where(f =>
    f.LastWriteTime.Date < StartThresholdInDays.Days().Ago().Date
    && f.LastWriteTime.Date >= StopThresholdInDays.Days().Ago().Date
);
foreach (var file in filesToArchive)
{
    file.CopyTo(PathToTarget+file.Name);
}
The Days().Ago() stuff is just syntactic sugar.
The only part that I think you could improve is the dirInfo.GetFiles("*.*") call. In .NET 3.5 and earlier, it returns an array with all the files, which takes time to build and uses lots of RAM. In .NET 4.0, there is a new Directory.EnumerateFiles method that returns an IEnumerable<string> instead, and yields each result lazily as it is read from the disk, so you can start processing before the whole directory has been listed. This could improve performance a bit, but don't expect miracles...
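As a rough sketch of what that change could look like (not a tested drop-in for your script): DirectoryInfo also gained an EnumerateFiles method in .NET 4.0 that yields FileInfo objects, so the LastWriteTime filter can stay as it is. The ArchiveSketch class, the ArchiveFiles method and its parameter names are made up for this example. I've also assumed plain DateTime thresholds in place of the Days().Ago() helpers, computed once rather than per file, and used Path.Combine for the target path.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper for illustration; the names are not from the original script.
static class ArchiveSketch
{
    // Copies every file in sourceDir whose last-write date is in the range
    // [stopDate, startDate) into targetDir, and returns how many were copied.
    public static int ArchiveFiles(string sourceDir, string targetDir,
                                   DateTime startDate, DateTime stopDate)
    {
        var dirInfo = new DirectoryInfo(sourceDir);
        Directory.CreateDirectory(targetDir); // no-op if it already exists

        // EnumerateFiles (.NET 4.0+) streams entries instead of building
        // a FileInfo[] for the whole directory up front.
        var filesToArchive = dirInfo.EnumerateFiles("*.*")
            .Where(f => f.LastWriteTime.Date < startDate
                     && f.LastWriteTime.Date >= stopDate);

        int copied = 0;
        foreach (var file in filesToArchive)
        {
            // Like the original, this throws if the target file already exists.
            file.CopyTo(Path.Combine(targetDir, file.Name));
            copied++;
        }
        return copied;
    }
}
```

You would call it with something like ArchiveSketch.ArchiveFiles(PathToSource, PathToTarget, DateTime.Today.AddDays(-StartThresholdInDays), DateTime.Today.AddDays(-StopThresholdInDays)), assuming that matches what your Days().Ago() helpers compute. The copy itself still goes through the slow storage, so the gain is mostly in memory use and time to first copy.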