
Directory.GetFiles() performance issues

Using System.IO.Directory.GetFiles(), I would like to find images with the .png extension located on a NAS server.

string searchingString = "ZLLK9";

// original
var fileList1 = Directory.GetFiles(directoryPath)
    .Select(p => new FileInfo(p))
    .Where(q => q.Name.Substring(0, q.Name.LastIndexOf('.')).Split('_').First() == searchingString);

// fixed
var fileList2 = Directory.GetFiles(directoryPath, string.Format("{0}_*.png", searchingString));

These are two ways to find the files whose names start with "ZLLK9".

The first ('original') way, using LINQ, is too slow at finding the files. The 'fixed' way resolves the performance problem, but I don't know what makes it different.

I would like help understanding the difference between the two approaches in detail.

FragrantJH asked Sep 21 '25 10:09


1 Answer

The first way is slow for 2 reasons:

  • You're constructing a FileInfo object for each file. Constructing a FileInfo is relatively light, but the instantiations add up if you're querying a lot of files, and they're unnecessary when all you need is the file's name.

  • The LINQ approach retrieves every file in the directory, then filters the results afterwards in your own code. It's much more efficient (and faster) to pass a search pattern to GetFiles and get the file system to do the filtering for you.
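
The two points above can be sketched side by side. This is only an illustration: the class and method names (PrefixSearch, FilterAfterwards, FilterByPattern) are made up for the example, and it assumes the files of interest are named like ZLLK9_something.png, as in the question's 'fixed' call.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper class, named for this example only.
public static class PrefixSearch
{
    // Retrieves every path in the directory, then filters in memory.
    // No FileInfo is created: Path.GetFileName works on the string alone.
    public static string[] FilterAfterwards(string directoryPath, string prefix)
    {
        return Directory.GetFiles(directoryPath)
            .Where(path =>
            {
                string name = Path.GetFileName(path);
                return name.StartsWith(prefix + "_", StringComparison.OrdinalIgnoreCase)
                    && name.EndsWith(".png", StringComparison.OrdinalIgnoreCase);
            })
            .ToArray();
    }

    // Hands the pattern to GetFiles, so only matching paths come back to the caller.
    public static string[] FilterByPattern(string directoryPath, string prefix)
    {
        return Directory.GetFiles(directoryPath, prefix + "_*.png");
    }
}
```

Both methods should return the same files for ordinary names; the difference is where the filtering happens. Timing the two against the real NAS share would be the way to confirm how much each point contributes in your case.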

If you still want to use LINQ, here's a more performant version of your query, which cuts out a lot of enumeration and string manipulation:

var fileList1 = Directory.GetFiles(directoryPath).Where(
    path => Regex.IsMatch(Path.GetFileName(path), @"^ZLLK9_.*\.png$"));
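
If the prefix comes from a variable rather than being hard-coded, the same idea might look like the sketch below. The RegexSearch class and FindByPrefix method are hypothetical names for this example, and swapping in Directory.EnumerateFiles (which yields paths lazily instead of building the whole array first) is my own substitution, not part of the query above.

```csharp
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class, named for this example only.
public static class RegexSearch
{
    public static string[] FindByPrefix(string directoryPath, string searchingString)
    {
        // Regex.Escape keeps characters such as '.' or '+' in the prefix literal.
        // The Regex is built once and reused for every file name.
        var pattern = new Regex("^" + Regex.Escape(searchingString) + @"_.*\.png$");

        return Directory.EnumerateFiles(directoryPath)
            .Where(path => pattern.IsMatch(Path.GetFileName(path)))
            .ToArray();
    }
}
```

Calling RegexSearch.FindByPrefix(directoryPath, "ZLLK9") applies the same pattern as the hard-coded version. Note that this match is case-sensitive, so ZLLK9_a.PNG would be skipped unless RegexOptions.IgnoreCase is added.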
Simon MᶜKenzie answered Sep 23 '25 00:09