Python string identity: `is` and `in` statements [duplicate]

Tags:

string

I had some problems getting this to work:

# Shortened for brevity
def _coerce_truth(word):
    TRUE_VALUES = ('true','1','yes')
    FALSE_VALUES = ('false','0','no')

    _word = word.lower().strip()
    print("t" in _word)  # debug output
    if _word in TRUE_VALUES:
        return True
    elif _word in FALSE_VALUES:
        return False

I discovered:

In [20]: "foo" is "Foo".lower()
Out[20]: False

In [21]: "foo" is "foo".lower()
Out[21]: False

In [22]: "foo" is "foo"
Out[22]: True

In [23]: "foo" is "foo".lower()
Out[23]: False

Why is this? I understand that identity is different from equality, but when is identity formed? Statement 22 should be False unless, due to the static nature of strings, id == eq. In this case I'm confused by statement 23.

Please explain and thanks in advance.

504

asked Sep 19 '13 20:09

Aaron Schif

2 Answers

Q. "When is identity formed?"

A. When the object is created.

What you're seeing is an implementation detail of CPython: it caches small strings (for example, short string literals) and reuses them for efficiency. Other interesting cases are:

"foo" is "foo".strip()  # True
"foo" is "foo"[:]       # True

Ultimately, what we see is that the string literal "foo" has been cached. Every time you type "foo", you're referencing the same object in memory. However, some string methods will choose to always create new objects (like .lower()) and some will smartly re-use the input string if the method made no changes (like .strip()).
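A minimal sketch of the practical consequence, assuming CPython 3 (the identity result is an implementation detail and may differ on other interpreters or versions): compare strings with == and, if you really need a shared object, ask for one explicitly with sys.intern. The variable names are only illustrative.

```python
import sys

literal = "foo"
# Built at runtime, so it is typically a distinct object from the literal.
built = "".join(["f", "o", "o"])

print(built == literal)  # equality compares contents
print(built is literal)  # identity: implementation detail, do not rely on it

# sys.intern returns the canonical interned copy of an equal string.
interned = sys.intern(built)
print(interned is sys.intern(literal))
```

The first and last checks are reliable; the middle one depends on what the interpreter happens to cache.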

One benefit of this is that string equality can be implemented by a pointer compare (blazingly fast) followed by a character-by-character comparison if the pointer comparison is false. If the pointer comparison is True, then the character-by-character comparison can be avoided.
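The shortcut can be modelled in pure Python roughly as follows. This is only an illustration of the idea, not CPython's actual C code, and str_equal is a hypothetical helper name.

```python
def str_equal(a, b):
    """Illustrative equality check with an identity shortcut."""
    if a is b:            # same object: contents must match
        return True
    if len(a) != len(b):  # different lengths can never be equal
        return False
    # Fall back to comparing character by character.
    return all(x == y for x, y in zip(a, b))


print(str_equal("foo", "Foo".lower()))  # True, via the character comparison
print(str_equal("foo", "fob"))          # False
```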

131

answered Sep 16 '22 14:09

mgilson

As for the relation between is and in:

The __contains__ method (which is what the in operator calls) for tuple and list first checks identity while looking for a match and, if that fails, checks equality. This gives you sane results even with objects that don't compare equal to themselves:

>>> x = float("NaN")
>>> t = (1, 2, x)
>>> x in t
True
>>> any(x == e for e in t) # this might be surprising
False
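The lookup described above can be sketched in plain Python like this. The contains function is a hypothetical stand-in for the built-in behaviour, which the language reference describes as any(x is e or x == e for e in y) for lists and tuples.

```python
def contains(seq, value):
    """Rough model of `value in seq` for a list or tuple."""
    return any(item is value or item == value for item in seq)


nan = float("nan")
items = (1, 2, nan)

print(contains(items, nan))           # True: the identity check matches
print(contains(items, float("nan")))  # False: a different NaN object, and NaN != NaN
print(nan in items, float("nan") in items)  # True False
```

This is also why `_word in TRUE_VALUES` in the question works even though "foo" is "Foo".lower() is False: when the identity check fails, the equality check still finds the match.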

answered Sep 17 '22 14:09

lqc
