When in a Python interactive session:
In [1]: a = "my string"
In [2]: b = "my string"
In [3]: a == b
Out[3]: True
In [4]: a is b
Out[4]: False
In [5]: import sys
In [6]: print(sys.version)
3.5.2 (default, Nov 17 2016, 17:05:23)
[GCC 5.4.0 20160609]
On the other hand, when running the following program:
#!/usr/bin/env python
import sys
def test():
    a = "my string"
    b = "my string"
    print(a == b)
    print(a is b)
if __name__ == "__main__":
    test()
    print(sys.version)
The output is:
True
True
3.5.2 (default, Nov 17 2016, 17:05:23)
[GCC 5.4.0 20160609]
Why does a is b have a different outcome in the above two cases?
I am aware of this answer (and, of course, of the difference between the == and is operators; that is the point of the question!), but aren't a and b the same object in the first case (the interactive interpreter) as well, since they point to the same (immutable) string?
Like many other modern programming languages, Python uses string interning to gain a performance boost. In Python, the is operator tells us whether two names refer to the same in-memory object.
Strings are stored as sequences of characters in a contiguous memory location. Characters can be indexed from either end: non-negative indices count from the start and negative indices count from the end. Strings are an immutable data type in Python, which means that once a string is created, it cannot be changed.
String interning is a method of storing only one copy of each distinct string value, which must be immutable. In Python, passing strings through sys.intern() ensures that all strings with the same contents share the same object.
By using intern you avoid holding two string objects with the same value: when you intern a second string whose value equals that of an already interned string, you receive a reference to the pre-existing string object. This way, you save memory.
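As a minimal sketch of explicit interning, the snippet below uses sys.intern() from the standard library. The strings are built at run time so that the compiler cannot merge them as constants; the variable names are only illustrative. Whether a is b holds before interning is a CPython implementation detail, so no result is claimed for that line.

```python
import sys

# Build two equal strings at run time, so they are not compile-time constants.
parts = ["my", "string"]
a = " ".join(parts)
b = " ".join(parts)

print(a == b)  # True: the values are equal
print(a is b)  # implementation-dependent; typically False on CPython

# sys.intern() returns the canonical object for a given string value.
a_interned = sys.intern(a)
b_interned = sys.intern(b)
print(a_interned is b_interned)  # True: both names refer to the interned object
```

After interning, identity comparison works only because both names were passed through sys.intern(); == remains the right way to compare string values.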
This is caused by string interning. See this question for another example.
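In your example, CPython compiles the whole module at once and reuses a single object for the identical string constants, but in the REPL each statement is compiled separately, so each line gets its own string object. (A string containing a space, such as "my string", is not automatically interned the way identifier-like strings are.)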
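One way to look at this on CPython is to inspect the constants stored on a function's code object. The sketch below assumes CPython; compare_literals is a hypothetical name that mirrors the function in the question, and the constant sharing it shows is an implementation detail rather than a language guarantee.

```python
def compare_literals():
    a = "my string"
    b = "my string"
    return a is b

# co_consts holds the constants the compiler stored for this function.
consts = compare_literals.__code__.co_consts
occurrences = [c for c in consts if c == "my string"]

# On CPython the two identical literals are stored once, so both names
# end up bound to the same object when the function runs.
print(len(occurrences))
print(compare_literals())
```

In an interactive session each input is compiled into its own code object, so this sharing between two separate lines does not apply.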
So the console creates two different objects when you create the two strings on separate lines, but when the interpreter runs code inside a single function it reuses the same object for identical strings. Here is how to check whether this is happening to you:
a = "my string"
b = "my string"
print(id(a))
print(id(b))
If these two ids are the same, then a is b will return True; if not, it will return False.
It looks like you are using Anaconda, so I checked this in the console and found different ids. I then wrote a function in the editor, executed it, and got the same ids.
Note: Now that we know that is determines whether two variable names point to the same object in memory, I should say that is should be used sparingly. It is usually used to compare against singletons like None (a is None, for example). So don't use it to compare the values of objects; use == instead, and when creating classes implement the __eq__ method so you can use the == operator.
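A minimal sketch of that advice follows. Point is a hypothetical class invented for illustration; it defines __eq__ so that == compares values, while is is kept for the None check.

```python
class Point:
    """Hypothetical value class used only to illustrate __eq__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        # Let Python try the other operand if the types do not match.
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Defining __eq__ disables the default hash, so provide a matching one.
        return hash((self.x, self.y))


p = Point(1, 2)
q = Point(1, 2)

print(p == q)     # True: same value
print(p is q)     # False: two distinct objects
print(p is None)  # False: identity check against the None singleton
```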