
1-length string comparison gives different result than character comparison... why?

I am quite new to C# and I found something unexpected in string comparison that I don't really understand.

Can someone please explain why comparing characters gives the opposite result from comparing one-character strings in the following code?

I expected "9" < "=" to be true, since the Unicode code point of '9' (57) is less than that of '=' (61), but it is false. What comparison logic do strings use, and why does it differ from comparing the characters?

Code:

bool resChComp = '9' < '=';
bool resStrComp = String.Compare("9", "=") < 0;

Console.WriteLine($"\n'9' < '=' : {resChComp}, \"9\" < \"=\" : { resStrComp }");

Output:

'9' < '=' : True, "9" < "=" : False
Frank asked May 16 '18 16:05



1 Answer

By default, String.Compare(string, string) performs a culture-sensitive 'word sort'. From the documentation:

The .NET Framework uses three distinct ways of sorting: word sort, string sort, and ordinal sort. Word sort performs a culture-sensitive comparison of strings. Certain nonalphanumeric characters might have special weights assigned to them. For example, the hyphen ("-") might have a very small weight assigned to it so that "coop" and "co-op" appear next to each other in a sorted list. String sort is similar to word sort, except that there are no special cases. Therefore, all nonalphanumeric symbols come before all alphanumeric characters. Ordinal sort compares strings based on the Unicode values of each element of the string.

The comparison you are expecting is the ordinal comparison, which you can get by passing StringComparison.Ordinal to the String.Compare overload that accepts it, like so:

bool resStrComp = String.Compare("9", "=", StringComparison.Ordinal) < 0;

This compares the strings by the numeric values of their UTF-16 code units, which is the same way one char is compared to another.
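To see the difference side by side, here is a minimal sketch. The class name CompareDemo and the choice of CultureInfo.InvariantCulture are assumptions made for illustration. Expected output is shown only for the ordinal cases, because the culture-sensitive results depend on the collation data of the runtime and platform.

```csharp
using System;
using System.Globalization;

class CompareDemo // hypothetical name, for illustration only
{
    static void Main()
    {
        // char comparison uses the numeric code unit values: '9' is 57, '=' is 61.
        Console.WriteLine('9' < '=');                                               // True

        // Ordinal string comparison uses the same numeric values.
        Console.WriteLine(string.Compare("9", "=", StringComparison.Ordinal) < 0);  // True
        Console.WriteLine(string.CompareOrdinal("9", "=") < 0);                     // True

        // The three sort styles from the documentation, selected via CompareOptions.
        // InvariantCulture is an assumption; the sign of the first two results
        // depends on the collation data in use, so no output is claimed for them.
        CompareInfo ci = CultureInfo.InvariantCulture.CompareInfo;
        Console.WriteLine(ci.Compare("9", "=", CompareOptions.None));       // word sort
        Console.WriteLine(ci.Compare("9", "=", CompareOptions.StringSort)); // string sort
        Console.WriteLine(ci.Compare("9", "=", CompareOptions.Ordinal) < 0); // True
    }
}
```

Making the comparison type explicit at every call site avoids relying on the default of the thread's current culture.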

Jonathon Chase answered Oct 08 '22 20:10