I had some problems getting this to work:
# Shortened for brevity
def _coerce_truth(word):
    TRUE_VALUES = ('true', '1', 'yes')
    FALSE_VALUES = ('false', '0', 'no')
    _word = word.lower().strip()
    print("t" in _word)  # debugging output
    if _word in TRUE_VALUES:
        return True
    elif _word in FALSE_VALUES:
        return False
I discovered:
In [20]: "foo" is "Foo".lower()
Out[20]: False
In [21]: "foo" is "foo".lower()
Out[21]: False
In [22]: "foo" is "foo"
Out[22]: True
In [23]: "foo" is "foo".lower()
Out[23]: False
Why is this? I understand that identity is different from equality, but when is identity formed? Statement 22 should be False
unless, due to the static nature of strings, identity and equality are the same thing. In that case I'm confused by statement 23.
Please explain and thanks in advance.
How would you confirm that two strings have the same identity? The is operator returns True if two names point to the same object in memory. This is what we mean when we talk about identity. Don't confuse is with ==, which tests only equality.
Python has the usual comparison operators: ==, !=, <, <=, >, >=. Unlike in Java and C, == is overloaded to work correctly with strings. The boolean operators are the spelled-out words *and*, *or*, *not* (Python does not use the C-style && || !).
String comparison using == in Python: the == operator compares the values of two strings. If the strings are equal, it returns True; otherwise it returns False.
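A minimal sketch of the difference, assuming Python 3; the variable names are illustrative only. Whether two equal strings built separately share one object is up to the interpreter, so the last line prints that result rather than relying on it.

```python
# Illustrative names; assumes Python 3.
a = "foo"
b = "".join(["f", "o", "o"])  # same characters, assembled at run time
c = a                         # a second name bound to the same object

print(a == b)  # equality: compares the characters
print(c is a)  # identity: both names refer to one object
print(a is b)  # identity of separately built strings: implementation-dependent
```

For comparing string values, == is the operator to use; is only answers whether two names refer to the very same object.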
Q. "When is identity formed?"
A. When the object is created.
What you're seeing is actually an implementation detail of CPython: it caches some strings, such as short literals that appear in your code, and reuses them for efficiency. Other interesting cases are:
"foo" is "foo".strip() # True
"foo" is "foo"[:] # True
Ultimately, what we see is that the string literal "foo" has been cached. Every time you type "foo", you're referencing the same object in memory. However, some string methods will choose to always create new objects (like .lower()) and some will smartly reuse the input string if the method made no changes (like .strip()).
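If you want equal strings to share one object on purpose rather than by accident, Python 3 exposes the cache through sys.intern. The sketch below is only an illustration (the variable names are made up), and it deliberately avoids asserting whether .lower() returned a new object, since that is up to the implementation.

```python
import sys

literal = "foo"
lowered = "FOO".lower()  # equal to literal, but possibly a distinct object

# sys.intern returns the canonical copy of a string from the interpreter's
# intern table, so equal strings map to the same object afterwards.
canonical_a = sys.intern(literal)
canonical_b = sys.intern(lowered)

print(lowered == literal)          # the values are equal
print(canonical_a is canonical_b)  # the interned copies are one object
```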
One benefit of reusing string objects is that string equality can be implemented as a pointer comparison (blazingly fast), followed by a character-by-character comparison only if the pointers differ. If the pointers are equal, the character-by-character comparison can be skipped.
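The idea can be sketched in pure Python. This is only an illustration of the shortcut, not CPython's actual C implementation, and strings_equal is a hypothetical helper name.

```python
def strings_equal(a, b):
    """Hypothetical helper illustrating the identity shortcut."""
    if a is b:            # same object: no need to look at the characters
        return True
    if len(a) != len(b):  # different lengths can never be equal
        return False
    # fall back to comparing character by character
    return all(x == y for x, y in zip(a, b))
```

In everyday code you would simply write a == b; the sketch only shows why the identity check is a cheap first step.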
As for the relation between is and in:
The __contains__ method (which stands behind the in operator) for tuple and list, while looking for a match, first checks identity and, if that fails, checks equality. This gives you sane results even with objects that don't compare equal to themselves:
>>> x = float("NaN")
>>> t = (1, 2, x)
>>> x in (t)
True
>>> any(x == e for e in t) # this might be surprising
False
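The lookup described above can be approximated in plain Python. The function name contains is made up for this sketch (assuming Python 3); it mirrors the "identity first, then equality" rule for list and tuple rather than reproducing the real implementation.

```python
def contains(container, item):
    """Hypothetical stand-in for `item in container` on a list or tuple."""
    return any(e is item or e == item for e in container)

x = float("nan")
t = (1, 2, x)

found_same_object = contains(t, x)           # the identity check succeeds
found_other_nan = contains(t, float("nan"))  # a different NaN object: no match
```

The first lookup finds x because it is the very object stored in the tuple; the second fails because a freshly created NaN is neither identical nor equal to any element.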