I'm wondering how Python does string comparison, more specifically how it determines the outcome when a less than <
or greater than >
operator is used.
For instance if I put print('abc' < 'bac')
I get True
. I understand that it compares corresponding characters in the string, however its unclear as to why there is more, for lack of a better term, "weight" placed on the fact that a
is less thanb
(first position) in first string rather than the fact that a
is less than b
in the second string (second position).
Using String. equals() :In Java, string equals() method compares the two given strings based on the data/content of the string. If all the contents of both the strings are same then it returns true. If any character does not match, then it returns false.
You can convert a numeric string into integer using Integer. parseInt(String) method, which returns an int type. And then comparison is same as 4 == 4 .
String comparison means to check if the given two string are equal, if first string is greater than second string, or if first string is less than second string. std::string::compare() function in C++ compares two strings and returns a number.
Solution. Following example compares two strings by using str compareTo (string), str compareToIgnoreCase(String) and str compareTo(object string) of string class and returns the ascii difference of first odd characters of compared strings.
From the docs:
The comparison uses lexicographical ordering: first the first two items are compared, and if they differ this determines the outcome of the comparison; if they are equal, the next two items are compared, and so on, until either sequence is exhausted.
Also:
Lexicographical ordering for strings uses the Unicode code point number to order individual characters.
or on Python 2:
Lexicographical ordering for strings uses the ASCII ordering for individual characters.
As an example:
>>> 'abc' > 'bac' False >>> ord('a'), ord('b') (97, 98)
The result False
is returned as soon as a
is found to be less than b
. The further items are not compared (as you can see for the second items: b
> a
is True
).
Be aware of lower and uppercase:
>>> [(x, ord(x)) for x in abc] [('a', 97), ('b', 98), ('c', 99), ('d', 100), ('e', 101), ('f', 102), ('g', 103), ('h', 104), ('i', 105), ('j', 106), ('k', 107), ('l', 108), ('m', 109), ('n', 110), ('o', 111), ('p', 112), ('q', 113), ('r', 114), ('s', 115), ('t', 116), ('u', 117), ('v', 118), ('w', 119), ('x', 120), ('y', 121), ('z', 122)] >>> [(x, ord(x)) for x in abc.upper()] [('A', 65), ('B', 66), ('C', 67), ('D', 68), ('E', 69), ('F', 70), ('G', 71), ('H', 72), ('I', 73), ('J', 74), ('K', 75), ('L', 76), ('M', 77), ('N', 78), ('O', 79), ('P', 80), ('Q', 81), ('R', 82), ('S', 83), ('T', 84), ('U', 85), ('V', 86), ('W', 87), ('X', 88), ('Y', 89), ('Z', 90)]
Python string comparison is lexicographic:
From Python Docs: http://docs.python.org/reference/expressions.html
Strings are compared lexicographically using the numeric equivalents (the result of the built-in function ord()) of their characters. Unicode and 8-bit strings are fully interoperable in this behavior.
Hence in your example, 'abc' < 'bac'
, 'a' comes before (less-than) 'b' numerically (in ASCII and Unicode representations), so the comparison ends right there.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With