
Can you explain this bizarre crash in the .NET runtime?

My C# application sends me a stack trace when it throws an unhandled exception, and I'm looking at one now that I don't understand.

It looks as though this can't possibly be my fault, but usually when I think that I'm subsequently proved wrong. 8-) Here's the stack trace:

mscorlib caused an exception (ArgumentOutOfRangeException): startIndex cannot be larger than length of string.
Parameter name: startIndex
   System.String::InternalSubStringWithChecks(Int32 startIndex, Int32 length, Boolean fAlwaysCopy) + 6c
   System.String::Substring(Int32 startIndex) + 0
   System.IO.Directory::InternalGetFileDirectoryNames(String path, String userPathOriginal, String searchPattern, Boolean includeFiles, Boolean includeDirs, SearchOption searchOption) + 149
   System.IO.Directory::GetFiles(String path, String searchPattern, SearchOption searchOption) + 1c
   System.IO.Directory::GetFiles(String path) + 0
   EntrianSourceSearch.Index::zz18ez() + 19b
   EntrianSourceSearch.Index::zz18dz() + a

So my code (the obfuscated function names at the end) calls System.IO.Directory.GetFiles(path) which crashes with a string indexing problem.

Sadly I don't know the value of path that was passed in, but regardless of that, surely it shouldn't be possible for System.IO.Directory::GetFiles to crash like that? Try as I might I can't come up with any argument to GetFiles that reproduces the crash.

Am I really looking at a bug in the .NET runtime, or is there something that could legitimately cause this exception? (I could understand things going wrong if the directory was being changed at the time I called GetFiles, but I wouldn't expect a string indexing exception in that case.)

Edit: Thanks to everyone for their thoughts! The most likely theory so far is that there's a pathname with dodgy non-BMP Unicode characters in it, but I still can't make it break. Looking at the code in GetFiles with Reflector, I think the only way it can break is for GetDirectoryName() to return a path that's longer than its input, even when its input is already fully normalised. Bizarre. I've tried making pathnames with non-BMP characters in (I've never had a directory called {MUSICAL SYMBOL G CLEF} before 8-) but I still can't make it break.

What I've done is add additional logging around the failing code (and made sure my logging works with non-BMP characters!). If it happens again, I'll have a lot more information.
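For anyone wanting to do the same, here is a minimal sketch of that kind of logging. The class name, the `GetFilesLogged` wrapper and the `log` callback are hypothetical names for illustration, not the asker's actual code. The idea is to dump each UTF-16 code unit of the path as hex, so surrogate pairs (non-BMP characters) and stray unpaired surrogates stay visible whatever encoding the log ends up in.

```csharp
using System;
using System.IO;
using System.Text;

static class GetFilesDiagnostics
{
    // Renders the string length plus every UTF-16 code unit as four hex
    // digits, so non-BMP characters show up as their two surrogate halves.
    public static string DescribePath(string path)
    {
        if (path == null) return "<null>";

        var sb = new StringBuilder();
        sb.Append("length=").Append(path.Length).Append(" units=");
        foreach (char c in path)
        {
            sb.Append(((int)c).ToString("X4")).Append(' ');
        }
        return sb.ToString().TrimEnd();
    }

    // Hypothetical wrapper: records the exact argument when GetFiles
    // fails with the exception from the stack trace, then rethrows.
    public static string[] GetFilesLogged(string path, Action<string> log)
    {
        try
        {
            return Directory.GetFiles(path);
        }
        catch (ArgumentOutOfRangeException)
        {
            log("Directory.GetFiles failed for path: " + DescribePath(path));
            throw;
        }
    }
}
```

Calling `GetFilesDiagnostics.GetFilesLogged(path, Console.Error.WriteLine)` in place of `Directory.GetFiles(path)` keeps the behaviour the same, apart from the extra log line on failure.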

RichieHindle asked Aug 24 '09 21:08


1 Answer

You can try looking into the code for System.IO.Directory.GetFiles() with .NET Reflector. From a quick look, it apparently only calls String.Substring() to split something off the end of the path, and adds it back near the end of the method. It checks Path.DirectorySeparatorChar (the backslash, '\') and Path.AltDirectorySeparatorChar (the slash, '/') to determine the index and length of the substring.

My guess would be that invalid or Unicode file or folder names are confusing the method.
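To show how a Substring() call of that shape could produce the exception in the stack trace, here is a simplified model. It is not the framework's actual code. `TailAfterDirectory` and its parameters are made-up names, and the assumption is only that the method takes the part of the full path that follows the directory portion. If the directory portion ever came back longer than the full path, startIndex would exceed the string length and Substring(Int32) would throw ArgumentOutOfRangeException.

```csharp
using System;

static class SubstringFailureSketch
{
    // Simplified stand-in for "split something from the end of the path":
    // returns whatever follows the directory portion of fullPath.
    public static string TailAfterDirectory(string fullPath, string directoryName)
    {
        // Throws ArgumentOutOfRangeException when
        // directoryName.Length > fullPath.Length.
        return fullPath.Substring(directoryName.Length);
    }

    // Reports whether the split above throws for the given pair of strings.
    public static bool Throws(string fullPath, string directoryName)
    {
        try
        {
            TailAfterDirectory(fullPath, directoryName);
            return false;
        }
        catch (ArgumentOutOfRangeException)
        {
            return true;
        }
    }
}
```

With a normal pair such as `@"C:\src\*"` and `@"C:\src"` the call returns `@"\*"`. It only throws if the second string is longer than the first, which matches the question's conclusion that the directory name would have to come back longer than its input.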

Lucas answered Nov 14 '22 23:11