String Comparison And Alphabetic Order of Individual Characters

I have a question related to string comparison vs. character comparison.

The characters > and 0 (zero) have the decimal code values 62 and 48, respectively.

When I compare the two characters in the following code, I get True (which is correct):

Console.WriteLine('>' > '0');

When I compare two one-character strings in the following code, I get -1, which indicates that ">" is less than "0" (the default culture is English):

Console.WriteLine(string.Compare(">", "0"));

Whereas comparing "3" and "1" (code values 51 and 49) in the following code returns 1 (as expected):

Console.WriteLine(string.Compare("3", "1"));

Also, the documentation for string.Compare(string str1, string str2) says:

The comparison uses the current culture to obtain culture-specific information such as casing rules and the alphabetic order of individual characters

Would you be able to explain (or point to some documentation on) how string comparison is implemented, e.g. how the alphabetic order of individual characters is determined?

Alexandar asked Feb 19 '13 21:02


2 Answers

The sort order of strings depends on the culture you use.

StringComparer.CurrentCulture sorts these one-character strings as follows on my machine:

' -   ! " # $ % & (  ) * , . / : ; ? @ [
\ ] ^ _ ` { | } ~ +  < = > 0 1 2 3 4 5 6
7 8 9 a A b B c C d  D e E f F g G h H i
I j J k K l L m M n  N o O p P q Q r R s
S t T u U v V w W x  X y Y z Z

StringComparer.Ordinal sorts the same strings as follows:

  ! " # $ % & ' ( )  * + , - . / 0 1 2 3
4 5 6 7 8 9 : ; < =  > ? @ A B C D E F G
H I J K L M N O P Q  R S T U V W X Y Z [
\ ] ^ _ ` a b c d e  f g h i j k l m n o
p q r s t u v w x y  z { | } ~
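
Listings like the two above can be reproduced with a short sketch along these lines. The names `SortOrderDemo` and `SortPrintableAscii` are made up for illustration and are not from the original answer. The culture-sensitive line depends on the machine's current culture and on the collation data the .NET runtime uses, so it may not match the listing above character for character; the ordinal line should.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SortOrderDemo
{
    // Returns the printable ASCII characters (space through '~') as
    // one-character strings, sorted with the supplied comparer.
    public static List<string> SortPrintableAscii(StringComparer comparer)
    {
        return Enumerable.Range(32, 95)                 // code values 32..126
            .Select(code => ((char)code).ToString())
            .OrderBy(s => s, comparer)
            .ToList();
    }

    public static void Main()
    {
        // Culture-sensitive order: varies with culture and platform.
        Console.WriteLine(string.Join(" ", SortPrintableAscii(StringComparer.CurrentCulture)));

        // Ordinal order: sorted purely by numeric code value.
        Console.WriteLine(string.Join(" ", SortPrintableAscii(StringComparer.Ordinal)));
    }
}
```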

dtb answered Sep 21 '22 17:09


When you compare the characters '>' and '0', you are comparing their ordinal values.

To get the same behaviour from a string comparison, supply the ordinal string comparison type:

  Console.WriteLine(string.Compare(">", "0", StringComparison.Ordinal));
  Console.WriteLine(string.Compare(">", "0", StringComparison.InvariantCulture));
  Console.WriteLine(string.Compare(">", "0", StringComparison.CurrentCulture));

The current culture is used by default. Its sort order is intended to sort strings 'alphabetically', for some definition of alphabetically, rather than in strictly ordinal (code value) order.
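
To see the difference side by side, a minimal sketch such as the following may help. `CompareDemo` and `SignOf` are hypothetical names introduced here. The helper reduces each result to its sign, because the documentation only promises a negative, zero, or positive return value. No expected output is given for the two culture-sensitive calls, since it depends on the culture and runtime in use.

```csharp
using System;

public static class CompareDemo
{
    // Reduces a string.Compare result to -1, 0, or 1.
    public static int SignOf(string a, string b, StringComparison kind)
    {
        return Math.Sign(string.Compare(a, b, kind));
    }

    public static void Main()
    {
        // char comparison uses the numeric code values: 62 > 48.
        Console.WriteLine('>' > '0');                                        // True

        // Ordinal string comparison agrees with the char comparison.
        Console.WriteLine(SignOf(">", "0", StringComparison.Ordinal));       // 1

        // Culture-sensitive comparisons follow linguistic sort rules instead;
        // the question reports -1 for the default overload on an English culture.
        Console.WriteLine(SignOf(">", "0", StringComparison.CurrentCulture));
        Console.WriteLine(SignOf(">", "0", StringComparison.InvariantCulture));
    }
}
```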

Pete Kirkham answered Sep 18 '22 17:09