c# EnumerateFiles wildcard returning non matches?

Tags:

c#

.net

As a simplified example, I am executing the following:

IEnumerable<string> files = Directory.EnumerateFiles(path, @"2010*.xml", 
    SearchOption.TopDirectoryOnly).ToList();

In my result set I am getting a few files which do not match the file pattern. According to MSDN, the searchPattern wildcard means "zero or more characters" and is not a regex. For example, I am getting back a file named "2004_someothername.xml".

For information there are in excess of 25,000 files in the folder.

Does anyone have any idea what is going on?

Sam Underhill asked Mar 10 '11 15:03


1 Answer

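This is due to how Windows does wildcard matching: it tests the pattern against each file's encoded 8.3 short name as well as its long name, resulting in some surprising matches! In a folder with thousands of similarly named files, NTFS can generate short names from the first two characters of the long name followed by a four-digit hex hash, so a file such as "2004_someothername.xml" can end up with a short name that starts with "2010".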

One way to work around this is to re-check every result that comes back from the OS wildcard match, comparing the wildcard manually against each (long) file name. Another way is to turn off 8.3 file name generation altogether via the registry. I have been burned by this on numerous occasions, including having important (non-matching) files deleted by a wildcard-based del command at the command prompt.
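The re-check of the long file names could look roughly like the sketch below. The class and method names (LongNameFilter, WildcardToRegex, MatchesLongName, EnumerateFilesStrict) are made up for illustration and are not part of the framework. It also assumes that only the * and ? wildcards matter and treats ? as exactly one character, which is a simplification of the real Windows rules.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: re-tests OS wildcard results against the long file name only.
public static class LongNameFilter
{
    // Translate a simple wildcard pattern (* and ?) into an anchored,
    // case-insensitive regular expression.
    public static Regex WildcardToRegex(string pattern)
    {
        string body = Regex.Escape(pattern)
                           .Replace(@"\*", ".*")   // * = zero or more characters
                           .Replace(@"\?", ".");   // ? = one character (simplified)
        return new Regex("^" + body + "$",
                         RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);
    }

    // True if the long file name itself matches the wildcard pattern.
    public static bool MatchesLongName(string fileName, string pattern)
    {
        return WildcardToRegex(pattern).IsMatch(fileName);
    }

    // Let the OS do the coarse match, then drop any result whose long
    // name does not match (i.e. it only matched through its 8.3 alias).
    public static IEnumerable<string> EnumerateFilesStrict(string path, string pattern)
    {
        Regex regex = WildcardToRegex(pattern);
        return Directory.EnumerateFiles(path, pattern, SearchOption.TopDirectoryOnly)
                        .Where(f => regex.IsMatch(Path.GetFileName(f)));
    }
}
```

With this in place, the call from the question would become LongNameFilter.EnumerateFilesStrict(path, "2010*.xml").ToList(), and "2004_someothername.xml" would be filtered out even if the OS returns it.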

To summarize: be very careful about making critical production decisions or taking any action based on an OS file wildcard match without a secondary verification of the results, especially if the directory contains many files.

Here is an explanation of this bizarre behavior.

Another explanation from O'Reilly's site.

Michael Goldshteyn answered Nov 08 '22 21:11