
Does the culture of the StringComparison type of String.Equals matter?

Tags:

c#

In C#, you can compare two strings with String.Equals and supply a StringComparison.

I've recently been looking to replace my archaic approach of comparing the ToLower() results of both strings, because I read that it doesn't work for all languages/cultures.

From what I can tell, the comparison types determine sort order: for example, when a list contains one string with a special (accented or ligature) character and another with ae, they decide which should appear first (some cultures order these differently).

With string.Equals, ordering is not important. Is it therefore safe to assume that many of the options are irrelevant, and that only Ordinal and OrdinalIgnoreCase matter?

The MSDN article for String.Equals says

The comparisonType parameter indicates whether the comparison should use the current or invariant culture, honor or ignore the case of the two strings being compared, or use word or ordinal sort rules.

string.Equals(myString, theirString, StringComparison.OrdinalIgnoreCase)

I'd also be interested to know how the sort method works internally: does it use String.Compare to work out the relative positioning of two strings?

NibblyPig asked Aug 21 '12 15:08




1 Answer

Case-insensitive comparisons are culture-dependent. For example, in Turkish culture, i is not the lowercase of I: there, I is paired with ı, and İ is paired with i. See "Dotted and dotless I" on Wikipedia.
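To make the Turkish case concrete, here is a minimal sketch that compares the same pair of strings with an ordinal ignore-case comparison and with an explicitly supplied Turkish culture. The class name TurkishIDemo and the helper EqualsIgnoreCase are made up for illustration, and the culture-specific results assume the runtime has full culture data available (not the invariant globalization mode), so no output is asserted for them.

```csharp
using System;
using System.Globalization;

public static class TurkishIDemo
{
    // Hypothetical helper: case-insensitive equality under an explicitly given culture.
    public static bool EqualsIgnoreCase(string a, string b, CultureInfo culture) =>
        string.Compare(a, b, culture, CompareOptions.IgnoreCase) == 0;

    public static void Main()
    {
        // Assumption: the tr-TR culture is installed on this machine.
        CultureInfo turkish = CultureInfo.GetCultureInfo("tr-TR");

        // Ordinal ignore-case does not consult any culture: I pairs with i.
        Console.WriteLine(string.Equals("FILE", "file", StringComparison.OrdinalIgnoreCase));

        // Under Turkish rules I pairs with dotless ı, so this result can differ
        // from the ordinal one above.
        Console.WriteLine(EqualsIgnoreCase("FILE", "file", turkish));

        // The same pairing shows up in casing operations such as ToLower.
        Console.WriteLine("FILE".ToLower(turkish));
    }
}
```

Passing the culture explicitly keeps the example independent of whatever CultureInfo.CurrentCulture happens to be on the machine running it.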

There are a number of weird effects related to culture-sensitive string operations. For example, "KonNy".StartsWith("Kon") can return false under some cultures.

So I recommend switching to culture-insensitive operations even where the comparison seems harmless.
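As a rough sketch of that recommendation, the helpers below pass an explicit StringComparison so the result never depends on the current culture. Reading "culture-insensitive" as the Ordinal and OrdinalIgnoreCase options is an assumption on my part, and the names OrdinalOps, HasPrefix and SameIgnoringCase are invented for the example.

```csharp
using System;

public static class OrdinalOps
{
    // Prefix test on the raw UTF-16 code units, with no linguistic rules applied.
    public static bool HasPrefix(string text, string prefix) =>
        text.StartsWith(prefix, StringComparison.Ordinal);

    // Equality that ignores case using culture-independent case mapping.
    public static bool SameIgnoringCase(string a, string b) =>
        string.Equals(a, b, StringComparison.OrdinalIgnoreCase);

    public static void Main()
    {
        Console.WriteLine(HasPrefix("KonNy", "Kon"));          // True
        Console.WriteLine(SameIgnoringCase("Title", "TITLE")); // True
    }
}
```

Note that StartsWith(string) without a comparison argument is culture-sensitive, which is why the overload taking a StringComparison is used here.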


And even with culture-insensitive operations there is plenty of unintuitive behavior in Unicode, such as multiple representations of the same glyph, different code points that look identical, and zero-width characters that are ignored by some operations but observed by others.
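The "multiple representations of the same glyph" point can be illustrated with a small sketch: é written as one precomposed code point versus e followed by a combining accent. The class UnicodeForms and the helper EqualAfterNormalization are hypothetical names, and normalizing to form C before an ordinal comparison is just one possible way to handle this, not something the answer prescribes.

```csharp
using System;
using System.Text;

public static class UnicodeForms
{
    // Hypothetical helper: normalize both strings to NFC, then compare ordinally.
    public static bool EqualAfterNormalization(string a, string b) =>
        string.Equals(a.Normalize(NormalizationForm.FormC),
                      b.Normalize(NormalizationForm.FormC),
                      StringComparison.Ordinal);

    public static void Main()
    {
        string precomposed = "\u00E9"; // é as a single code point
        string decomposed = "e\u0301"; // e followed by a combining acute accent

        // The two strings render the same but contain different code units.
        Console.WriteLine(string.Equals(precomposed, decomposed, StringComparison.Ordinal)); // False

        // After normalization both are the single code point U+00E9.
        Console.WriteLine(EqualAfterNormalization(precomposed, decomposed));
    }
}
```

Normalization only addresses equivalent encodings of the same character; distinct code points that merely look alike remain unequal.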

CodesInChaos answered Nov 15 '22 19:11
