When doing case-insensitive comparisons, is it more efficient to convert the string to upper case or lower case? Does it even matter?
It is suggested in this SO post that C# is more efficient with ToUpper because "Microsoft optimized it that way." But I've also read this argument that converting ToLower vs. ToUpper depends on what your strings contain more of, and that typically strings contain more lower case characters which makes ToLower more efficient.
In particular, I would like to know:
Converting to either upper case or lower case in order to do case-insensitive comparisons is incorrect due to "interesting" features of some cultures, particularly Turkish, where the upper case of "i" is the dotted "İ" rather than "I". Instead, use a StringComparer with the appropriate options.
MSDN has some great guidelines on string handling. You might also want to check that your code passes the Turkey test.
EDIT: Note Neil's comment around ordinal case-insensitive comparisons. This whole realm is pretty murky :(
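To make the Turkish problem concrete, here is a minimal sketch. The class and helper names (TurkeyTestSketch, EqualsViaToUpper) are made up for illustration, and it assumes the runtime has Turkish culture data available; under globalization-invariant mode the "tr-TR" culture may be missing or may behave like the invariant culture. With Turkish data present, I would expect the first comparison to report a mismatch, because "i" upper-cases to "İ" there.

```csharp
using System;
using System.Globalization;

public static class TurkeyTestSketch
{
    // Hypothetical helper: the "normalize, then compare" approach,
    // with the culture made explicit so its effect is visible.
    public static bool EqualsViaToUpper(string a, string b, CultureInfo culture) =>
        a.ToUpper(culture) == b.ToUpper(culture);

    public static void Main()
    {
        // Assumption: Turkish culture data is installed on this machine.
        var turkish = new CultureInfo("tr-TR");

        // Culture-sensitive upper-casing: "file" and "FILE" may stop matching.
        Console.WriteLine(EqualsViaToUpper("file", "FILE", turkish));

        // Invariant upper-casing treats "i" and "I" as the same letter.
        Console.WriteLine(EqualsViaToUpper("file", "FILE", CultureInfo.InvariantCulture));

        // No conversion at all: let a StringComparer ignore case.
        Console.WriteLine(StringComparer.OrdinalIgnoreCase.Equals("file", "FILE"));
    }
}
```

Which StringComparer is "appropriate" depends on the data: OrdinalIgnoreCase is the usual pick for identifiers, paths and keys, while the culture-aware comparers are meant for text shown to users.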
From Microsoft on MSDN:
Best Practices for Using Strings in the .NET Framework
Recommendations for String Usage
- Use the String.ToUpperInvariant method instead of the String.ToLowerInvariant method when you normalize strings for comparison.
Why? From Microsoft:
Normalize strings to uppercase
There is a small group of characters that when converted to lowercase cannot make a round trip.
What is an example of such a character that cannot make a round trip?
ϱ , Ρ , ρ
.NET Fiddle
Original: ϱ
ToUpper: Ρ
ToLower: ρ
That is why, if you want to do case-insensitive comparisons, you convert the strings to uppercase, and not lowercase.
So if you have to choose one, choose Uppercase.
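If you want to reproduce the output above locally, something along these lines should do it. This is only a sketch: RoundTripSketch and CodePoints are names I made up, and it prints code points rather than glyphs so the result does not depend on the console encoding. I have deliberately left out expected output, since the casing tables can differ between runtimes and operating systems.

```csharp
using System;
using System.Linq;

public static class RoundTripSketch
{
    // Hypothetical helper: renders each UTF-16 code unit as "U+XXXX".
    public static string CodePoints(string s) =>
        string.Join(" ", s.Select(c => "U+" + ((int)c).ToString("X4")));

    public static void Main()
    {
        string original = "\u03F1";                  // ϱ (GREEK RHO SYMBOL)
        string upper = original.ToUpperInvariant();  // the answer above reports Ρ
        string roundTrip = upper.ToLowerInvariant(); // the answer above reports ρ

        Console.WriteLine("Original:   " + CodePoints(original));
        Console.WriteLine("ToUpper:    " + CodePoints(upper));
        Console.WriteLine("Round trip: " + CodePoints(roundTrip));

        // If the round trip does not return the original, the two lowercase
        // forms would compare as different after lowercase normalization.
        Console.WriteLine("Round trip preserved: " + (roundTrip == original));
    }
}
```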
According to MSDN it is more efficient to pass in the strings and tell the comparison to ignore case:
String.Compare(strA, strB, StringComparison.OrdinalIgnoreCase) is equivalent to (but faster than) calling
String.Compare(ToUpperInvariant(strA), ToUpperInvariant(strB), StringComparison.Ordinal).
These comparisons are still very fast.
Of course, if you are comparing one string over and over again then this may not hold.
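For reference, the two forms from the quote look roughly like this in real code. The wrapper names (IgnoreCaseSketch, EqualsIgnoreCase, EqualsViaNormalization) are mine, not MSDN's, and the sketch only shows that both give the same answer for simple input; it does not measure speed.

```csharp
using System;

public static class IgnoreCaseSketch
{
    // Let the comparison ignore case; no temporary strings are created.
    public static bool EqualsIgnoreCase(string a, string b) =>
        string.Compare(a, b, StringComparison.OrdinalIgnoreCase) == 0;

    // Normalize both sides first, then compare ordinally.
    // This allocates two new strings on every call.
    public static bool EqualsViaNormalization(string a, string b) =>
        string.Compare(a.ToUpperInvariant(), b.ToUpperInvariant(),
                       StringComparison.Ordinal) == 0;

    public static void Main()
    {
        Console.WriteLine(EqualsIgnoreCase("Hello", "hELLO"));
        Console.WriteLine(EqualsViaNormalization("Hello", "hELLO"));
        Console.WriteLine(EqualsIgnoreCase("Hello", "World"));
    }
}
```

The extra allocations in the second form are the likely reason it is slower for one-off comparisons.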