
Win32 API FindFirstFile and FindNextFile performance vs command line

We have encountered an unexpected performance issue when traversing directories looking for files using a wildcard pattern.

We have 180 folders, each containing 10,000 files. A command-line search using dir <pattern> /s completes almost instantly (under 0.25 seconds). However, the same search from our application takes 3 to 4 seconds.

We initially tried using System.IO.DirectoryInfo.GetFiles() with SearchOption.AllDirectories and have now tried the Win32 API calls FindFirstFile() and FindNextFile().

Profiling our code indicates that the vast majority of execution time is spent in these calls.

Our code is based on the following blog post:

http://codebetter.com/blogs/matthew.podwysocki/archive/2008/10/16/functional-net-fighting-friction-in-the-bcl-with-directory-getfiles.aspx

We found this to be slow, so we updated the GetFiles function to take a string search pattern rather than a predicate.

Can anyone shed any light on what might be wrong with our approach?

Richard Ev asked Nov 23 '09 10:11


1 Answer

In my tests, FindFirstFileEx with FindExInfoBasic and FIND_FIRST_EX_LARGE_FETCH is much faster than plain FindFirstFile.

Scanning 20 folders with ~300,000 files took 661 seconds with FindFirstFile and 11 seconds with FindFirstFileEx. Subsequent calls to the same folders took less than a second.

HANDLE h = FindFirstFileEx(search.c_str(), FindExInfoBasic, &data, FindExSearchNameMatch, NULL, FIND_FIRST_EX_LARGE_FETCH);
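
For context, here is a minimal sketch of how that call might sit inside a recursive search. It is an illustration under assumptions, not the code from the question or from my tests: the names wildcard_match and find_files are hypothetical, the wide-character (W) variants of the API are assumed, and the wildcard matcher is a simplified stand-in for the matching Windows performs. Each directory is enumerated once with "*" so that subdirectories are not skipped when their names do not match the pattern, and file names are filtered in user code.

```cpp
#include <cassert>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical helper: simplified, case-insensitive match supporting '*' and '?'.
// It does not emulate every quirk of the Windows wildcard rules.
bool wildcard_match(const std::wstring& name, const std::wstring& pattern) {
    size_t n = 0, p = 0, star = std::wstring::npos, mark = 0;
    while (n < name.size()) {
        if (p < pattern.size() && pattern[p] == L'*') {
            star = p++;   // remember the '*' and try matching zero characters
            mark = n;
        } else if (p < pattern.size() &&
                   (pattern[p] == L'?' ||
                    std::towlower(pattern[p]) == std::towlower(name[n]))) {
            ++n;
            ++p;
        } else if (star != std::wstring::npos) {
            p = star + 1; // backtrack: let the last '*' absorb one more character
            n = ++mark;
        } else {
            return false;
        }
    }
    while (p < pattern.size() && pattern[p] == L'*') ++p;
    return p == pattern.size();
}

#ifdef _WIN32
#include <windows.h>

// Hypothetical helper: appends to `out` the full path of every file under `dir`
// (recursively) whose name matches `pattern`.
// FindExInfoBasic and FIND_FIRST_EX_LARGE_FETCH need Windows 7 / Server 2008 R2
// or later.
void find_files(const std::wstring& dir, const std::wstring& pattern,
                std::vector<std::wstring>& out) {
    assert(!dir.empty());
    WIN32_FIND_DATAW data;
    const std::wstring search = dir + L"\\*";
    HANDLE h = FindFirstFileExW(search.c_str(), FindExInfoBasic, &data,
                                FindExSearchNameMatch, NULL,
                                FIND_FIRST_EX_LARGE_FETCH);
    if (h == INVALID_HANDLE_VALUE) return;  // missing or inaccessible directory
    do {
        const std::wstring name = data.cFileName;
        if (name == L"." || name == L"..") continue;
        const std::wstring path = dir + L"\\" + name;
        if (data.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) {
            // Skip reparse points (junctions, symlinks) to avoid cycles.
            if (!(data.dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT))
                find_files(path, pattern, out);
        } else if (wildcard_match(name, pattern)) {
            out.push_back(path);
        }
    } while (FindNextFileW(h, &data));
    FindClose(h);
}
#endif
```

A call such as find_files(L"C:\\data", L"*.log", results) would then fill results with the matching paths; the directory and pattern here are placeholders. Whether this closes the gap with dir /s on your machine is something to measure, since the numbers above come from my own setup.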
KPexEA answered Sep 23 '22 04:09