Why do OrdinalIgnoreCase and InvariantCultureIgnoreCase return different results?

I thought StringComparison.OrdinalIgnoreCase and StringComparison.InvariantCultureIgnoreCase did the same job for English-only strings. However, that is not the case in the following code I'm working on:

// Returns 0
string.Compare("877495169FA05B9D8639A0EBC42022338F7D2324","‎877495169fa05b9d8639a0ebc42022338f7d2324", StringComparison.InvariantCultureIgnoreCase)

// Returns -1
string.Compare("877495169FA05B9D8639A0EBC42022338F7D2324","‎877495169fa05b9d8639a0ebc42022338f7d2324", StringComparison.OrdinalIgnoreCase)

Is there a particular reason why?

Todd Li asked Mar 23 '13 01:03



People also ask

What is the difference between InvariantCultureIgnoreCase and OrdinalIgnoreCase?

InvariantCultureIgnoreCase uses comparison rules based on English, but without any regional variations. This is good for a neutral comparison that still takes some linguistic aspects into account. OrdinalIgnoreCase compares the character codes, ignoring case, without cultural aspects.

What does StringComparison OrdinalIgnoreCase do?

The Ordinal and OrdinalIgnoreCase members of the StringComparison enumeration enforce a comparison of the raw Char values (UTF-16 code units), similar to strcmp, that not only avoids bugs from linguistic interpretation of essentially symbolic strings but also provides better performance.

What is string ordinal comparison?

Ordinal comparisons are string comparisons in which each Char (UTF-16 code unit) of each string is compared without linguistic interpretation; for example, in a case-sensitive ordinal comparison "windows" does not match "Windows".

What are ordinal sort rules?

An operation that uses ordinal sort rules performs a comparison based on the numeric value (Unicode code point) of each Char in the string. An ordinal comparison is fast but culture-insensitive.


1 Answer

"‎877495169fa05b9d8639a0ebc42022338f7d2324"

Sounds like a trick question. There's an extra character at the start of this string, before the first digit 8. It isn't visible in the browser. It is U+200E, "Left-to-Right Mark". The ordinal comparison sees that character; the invariant comparison treats it as ignorable and skips it. You can see it for yourself by calling ToCharArray() on the string.
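A minimal sketch of that inspection, for anyone who wants to try it. The class and method names (HiddenCharDemo, Describe) are made up for illustration, and the mark is written as the escape \u200E so that it stays visible in the source:

```csharp
using System;
using System.Globalization;

static class HiddenCharDemo
{
    // Formats every Char (UTF-16 code unit) as "U+XXXX (UnicodeCategory)".
    // Hypothetical helper, not part of the framework.
    public static string Describe(string s)
    {
        var parts = new string[s.Length];
        for (int i = 0; i < s.Length; i++)
        {
            UnicodeCategory category = CharUnicodeInfo.GetUnicodeCategory(s[i]);
            parts[i] = $"U+{(int)s[i]:X4} ({category})";
        }
        return string.Join(" ", parts);
    }

    static void Main()
    {
        // Shortened stand-in for the pasted string: a Left-to-Right Mark, then "8774".
        string suspect = "\u200E8774";

        Console.WriteLine(suspect.Length);    // 5, although only 4 characters are visible
        Console.WriteLine(Describe(suspect)); // first entry is U+200E (Format)
    }
}
```

A Length that is one larger than the number of visible characters is usually the first hint that something invisible was pasted along.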

Delete that string and paste this one instead; I removed U+200E from it:

"877495169fa05b9d8639a0ebc42022338f7d2324"

And the Compare() method now returns 0 like it should. Do watch out for that text editor or IME you are using right now. Isn't Unicode fun?
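If the string comes from user input or the clipboard rather than from source code you can retype, one possible defence is to strip invisible format characters before doing an ordinal comparison. This is only a sketch under that assumption: FormatCharStripper and StripFormatChars are hypothetical names, and dropping every character in the Format category is fine for a hex string like this one but could be wrong for natural-language text, where some of those characters carry meaning.

```csharp
using System;
using System.Globalization;
using System.Text;

static class FormatCharStripper
{
    // Removes every Char whose Unicode category is Format (Cf), e.g. U+200E.
    // Works per UTF-16 code unit, so format characters outside the BMP are not handled.
    public static string StripFormatChars(string s)
    {
        var sb = new StringBuilder(s.Length);
        foreach (char c in s)
        {
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.Format)
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }

    static void Main()
    {
        string a = "877495169FA05B9D8639A0EBC42022338F7D2324";
        // Same value as in the question, with the hidden mark spelled out as an escape.
        string b = "\u200E877495169fa05b9d8639a0ebc42022338f7d2324";

        // The raw strings differ ordinally because of the leading U+200E.
        Console.WriteLine(string.Equals(a, b, StringComparison.OrdinalIgnoreCase)); // False

        // After stripping, only the letter case differs, which OrdinalIgnoreCase ignores.
        Console.WriteLine(string.Equals(a, StripFormatChars(b), StringComparison.OrdinalIgnoreCase)); // True
    }
}
```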

Hans Passant answered Jan 15 '23 13:01
