
Upper vs Lower Case

When doing case-insensitive comparisons, is it more efficient to convert the string to upper case or lower case? Does it even matter?

One SO post suggests that C# is more efficient with ToUpper because "Microsoft optimized it that way." But I've also read the argument that the relative cost of ToLower and ToUpper depends on which case your strings mostly contain, and that since typical strings are mostly lower case, ToLower is more efficient.

In particular, I would like to know:

  • Is there a way to optimize ToUpper or ToLower such that one is faster than the other?
  • Is it faster to do a case-insensitive comparison between upper or lower case strings, and why?
  • Are there any programming environments (e.g., C, C#, Python, whatever) where one case is clearly better than the other, and why?
Parappa asked Oct 24 '08 17:10



3 Answers

Converting to either upper case or lower case in order to do case-insensitive comparisons is incorrect due to "interesting" features of some cultures, particularly Turkish, where "i" and "I" are not a case pair. Instead, use a StringComparer with the appropriate options.

MSDN has some great guidelines on string handling. You might also want to check that your code passes the Turkey test.

EDIT: Note Neil's comment around ordinal case-insensitive comparisons. This whole realm is pretty murky :(
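
The same pitfall shows up outside .NET. As a minimal sketch in Python (not .NET, and not culture-aware), the snippet below compares a naive lower-case comparison with Unicode case folding via `str.casefold()`. The helper name `equals_ignore_case` is hypothetical, and case folding is only an approximation of what a culture-aware comparer does.

```python
def equals_ignore_case(a: str, b: str) -> bool:
    """Caseless comparison via Unicode case folding (hypothetical helper)."""
    return a.casefold() == b.casefold()


# German sharp s: lower() leaves "ß" untouched, so the naive check fails
# even though "STRASSE" is the upper-case form of "Straße".
naive = "Straße".lower() == "STRASSE".lower()
folded = equals_ignore_case("Straße", "STRASSE")

# Turkish dotless ı (U+0131) uppercases to plain "I", so culture-blind
# upper-casing treats "ı" and "i" as the same letter.
conflated = "\u0131".upper() == "i".upper()

print(naive, folded, conflated)
```

The point is only that a dedicated comparison operation knows about mappings that a plain ToUpper/ToLower round of normalization misses or conflates; which result is "correct" for dotless ı still depends on the culture you intend.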

Jon Skeet answered Oct 13 '22 09:10


From Microsoft on MSDN:

Best Practices for Using Strings in the .NET Framework

Recommendations for String Usage

  • Use the String.ToUpperInvariant method instead of the String.ToLowerInvariant method when you normalize strings for comparison.

Why? From Microsoft:

Normalize strings to uppercase

There is a small group of characters that when converted to lowercase cannot make a round trip.

What is an example of such a character that cannot make a round trip?

  • Start: Greek Rho Symbol (U+03f1) ϱ
  • Uppercase: Capital Greek Rho (U+03a1) Ρ
  • Lowercase: Small Greek Rho (U+03c1) ρ

ϱ , Ρ , ρ

.NET Fiddle output:

Original: ϱ
ToUpper: Ρ
ToLower: ρ

That is why, if you want to do case-insensitive comparisons, you convert the strings to uppercase, and not lowercase.

So if you have to choose one, choose Uppercase.
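
The rho example can be reproduced outside .NET as well, since it comes from the Unicode case mappings themselves. A minimal sketch in Python (an assumption here: Python's `str.upper()` and `str.lower()` stand in for ToUpperInvariant and ToLowerInvariant):

```python
rho_symbol = "\u03f1"  # ϱ GREEK RHO SYMBOL
small_rho = "\u03c1"   # ρ GREEK SMALL LETTER RHO

upper = rho_symbol.upper()   # Ρ GREEK CAPITAL LETTER RHO (U+03A1)
round_trip = upper.lower()   # ρ (U+03C1), not the ϱ we started with

# Normalizing to lower case keeps the two characters distinct...
equal_when_lowered = rho_symbol.lower() == small_rho.lower()
# ...while normalizing to upper case maps both onto the same capital.
equal_when_uppered = rho_symbol.upper() == small_rho.upper()

print(equal_when_lowered, equal_when_uppered)
```

So two strings that differ only in ϱ versus ρ compare as equal after upper-casing but as different after lower-casing.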

Ian Boyd answered Oct 13 '22 09:10


According to MSDN it is more efficient to pass in the strings and tell the comparison to ignore case:

String.Compare(strA, strB, StringComparison.OrdinalIgnoreCase) is equivalent to (but faster than) calling

String.Compare(ToUpperInvariant(strA), ToUpperInvariant(strB), StringComparison.Ordinal).

These comparisons are still very fast.

Of course, if you are comparing one string over and over again then this may not hold.
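
To illustrate the "one string over and over" case, here is a rough Python sketch (not .NET) in which the fixed string is normalized once and only the candidates are converted per comparison. The names `make_matcher` and `is_admin` are hypothetical, and whether this actually beats an ignore-case comparison is something to measure rather than assume.

```python
from typing import Callable


def make_matcher(target: str) -> Callable[[str], bool]:
    """Return a predicate that compares candidates to `target`, ignoring case.

    Hypothetical helper: the fixed string is normalized once, up front.
    """
    key = target.upper()  # paid once, not on every comparison

    def matches(candidate: str) -> bool:
        return candidate.upper() == key

    return matches


is_admin = make_matcher("Admin")
results = [is_admin(name) for name in ["ADMIN", "admin", "guest"]]
print(results)
```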

Rob Walker answered Oct 13 '22 08:10