I've started learning Python (Python 3.3) and I was trying out the is operator. I tried this:
>>> b = 'is it the space?'
>>> a = 'is it the space?'
>>> a is b
False
>>> c = 'isitthespace'
>>> d = 'isitthespace'
>>> c is d
True
>>> e = 'isitthespace?'
>>> f = 'isitthespace?'
>>> e is f
False
It seems like the space and the question mark make the is operator behave differently. What's going on?
EDIT: I know I should be using ==, I just wanted to know why the is operator behaves like this.
Warning: this answer is about the implementation details of a specific Python interpreter. Comparing strings with the is operator is a bad idea.
Well, at least for CPython 3.4/2.7.3, the answer is "no, it is not just the whitespace":
Two string literals will share memory if they are either alphanumeric or reside in the same block (file, function, class or single interpreter command).
An expression that evaluates to a string will result in an object that is identical to the one created using a string literal, if and only if it is created using constants and binary/unary operators, and the resulting string is shorter than 21 characters.
Single characters are unique.
Alphanumeric string literals always share memory:
>>> x='aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'
>>> y='aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'
>>> x is y
True
Non-alphanumeric string literals share memory if and only if they share the enclosing syntactic block:
(interpreter)
>>> x='`!@#$%^&*() \][=-. >:"?<a'; y='`!@#$%^&*() \][=-. >:"?<a';
>>> z='`!@#$%^&*() \][=-. >:"?<a';
>>> x is y
True
>>> x is z
False
(file)
x='`!@#$%^&*() \][=-. >:"?<a';
y='`!@#$%^&*() \][=-. >:"?<a';
z=(lambda : '`!@#$%^&*() \][=-. >:"?<a')()
print(x is y)
print(x is z)
Output: True and False
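The two literal rules above can be tried outside the interactive prompt with a small sketch. It assumes CPython and uses a hypothetical helper, run, that compiles each snippet as its own block via the built-in compile and exec. The last comparison is printed without a stated result because it depends on the interpreter version and build.

```python
def run(source):
    """Compile `source` as its own block, execute it, and return its names."""
    namespace = {}
    exec(compile(source, "<sketch>", "exec"), namespace)
    return namespace

# Identifier-like literal, compiled in two separate blocks.
first = run("s = 'isitthespace'")
second = run("s = 'isitthespace'")
print(first["s"] is second["s"])            # True on CPython: the literal is interned

# Literal containing a space and '?', written twice in the same block.
same_block = run("x = 'is it the space?'; y = 'is it the space?'")
print(same_block["x"] is same_block["y"])   # True on CPython: one shared constant

# The same literal again, but in a separately compiled block.
other_block = run("z = 'is it the space?'")
print(same_block["x"] is other_block["z"])  # implementation-dependent
```

None of these results is promised by the language; they only show what one interpreter happens to do.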
For simple binary operations on constants, the compiler does very simple constant folding (see peephole.c), but with strings it does so only if the resulting string is shorter than 21 characters. If this is the case, the rules mentioned earlier are in force:
>>> 'a'*10+'a'*10 is 'a'*20
True
>>> 'a'*21 is 'a'*21
False
>>> 'aaaaaaaaaaaaaaaaaaaaa' is 'aaaaaaaa' + 'aaaaaaaaaaaaa'
False
>>> t=2; 'a'*t is 'aa'
False
>>> 'a'.__add__('a') is 'aa'
False
>>> x='a' ; x+='a'; x is 'aa'
False
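One way to see whether the compiler folded an expression, without relying on the is operator at all, is to look at the constants stored in the compiled code object. This is a minimal sketch assuming CPython; the length limit for folding has changed between versions, so only a short string is used here.

```python
# Compile an expression made only of constants and operators.
folded_code = compile("'a' * 10", "<sketch>", "eval")
folded_string = "a" * 10
# If the compiler folded the expression, the result is stored as a constant.
print(folded_string in folded_code.co_consts)

# An expression that involves a variable cannot be folded at compile time.
runtime_code = compile("'a' * t", "<sketch>", "eval")
print("aa" in runtime_code.co_consts)   # False: 'aa' only exists at run time
print(eval(runtime_code, {"t": 2}))     # aa
```

The first check shows why a folded expression behaves like a literal, and the second why t=2; 'a'*t does not.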
Single characters always share memory, of course:
>>> chr(0x20) is ' '
True
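The same check can be written without a literal on either side of the is operator, which recent Python versions warn about. This is a sketch assuming CPython, where single characters in the Latin-1 range come from a shared cache.

```python
space_from_chr = chr(0x20)       # built at run time from a code point
space_from_index = "is it"[2]    # built at run time by indexing
print(space_from_chr == space_from_index)   # True: same contents
print(space_from_chr is space_from_index)   # True on CPython: same cached object
```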
To expand on Ignacio’s answer a bit: The is operator is the identity operator. It compares object identity. If you construct two objects with the same contents, they are usually still two distinct objects, so the identity test yields false. It works for some small strings because CPython, the reference implementation of Python, interns them: the contents are stored once, and all those objects refer to the same string. So the is operator returns true for those.
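The difference between identity and equality is easier to see with strings built at run time, where the compiler cannot share anything. This is a small sketch assuming CPython. The call to sys.intern is an addition for illustration and is not mentioned in the answer above; it returns the single interned copy of a string.

```python
import sys

parts = ["is it ", "the space?"]
a = "".join(parts)   # built at run time
b = "".join(parts)   # a second, separate object with the same contents

print(a == b)        # True: same characters
print(a is b)        # False on CPython: two distinct objects
print(sys.intern(a) is sys.intern(b))   # True: both map to one interned string
```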
This, however, is an implementation detail of CPython and is not guaranteed, either for CPython or for any other implementation. Relying on it is a bad idea, as it can break at any time.
To compare strings, you use the == operator, which compares objects for equality. Two string objects are considered equal when they contain the same characters. So this is the correct operator to use when comparing strings, and the is operator should generally be avoided unless you explicitly want object identity (example: a is False).
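As a rough illustration of that advice, the sketch below uses == for string contents and reserves the is operator for a singleton. The function describe and its return values are hypothetical names made up for this example.

```python
def describe(answer):
    """Classify an answer, using identity only for the None singleton."""
    if answer is None:        # identity check: there is exactly one None
        return "no answer"
    if answer == "yes":       # equality check: compares the characters
        return "affirmative"
    return "other"

built_at_runtime = "".join(["y", "es"])   # equal to "yes", possibly a different object
print(describe(None))               # no answer
print(describe(built_at_runtime))   # affirmative
print(describe("is it the space?")) # other
```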
If you are really interested in the details, you can find the implementation of CPython’s strings in the CPython source code. But again: this is an implementation detail, so you should never rely on it.