
Can ( s is "" ) and ( s == "" ) ever give different results in Python 2.6.2?

As any Python programmer knows, you should use == instead of is to compare two strings for equality. However, are there actually any cases where ( s is "" ) and ( s == "" ) will give different results in Python 2.6.2?

I recently came across code that used ( s is "" ) in code review, and while pointing out that this was incorrect I wanted to give an example of how this could fail. But try as I might, I can't construct two empty strings with different identities. It seems that the Python implementation must special-case the empty string in lots of common operations. For example:

>>> import re
>>> a = ""
>>> b = "abc"[ 2:2 ]
>>> c = ''.join( [] )
>>> d = re.match( '()', 'abc' ).group( 1 )
>>> e = a + b + c + d 
>>> a is b is c is d is e
True

However, this question suggests that there are cases where ( s is "" ) and ( s == "" ) can be different. Can anyone give me an example?

jchl asked Jul 02 '10 11:07


4 Answers

Python's is operator tests object identity, not equality. Here is an example where is and == give different results:

>>> s=u""
>>> print s is ""
False
>>> print s==""
True
zoli2k answered Oct 17 '22 01:10


As everyone else has said, don't rely on undefined behaviour. However, since you asked for a specific counterexample for Python 2.6, here it is:

>>> s = u"\xff".encode('ascii', 'ignore')
>>> s
''
>>> id(s)
10667744
>>> id("")
10666064
>>> s == ""
True
>>> s is ""
False
>>> type(s) is type("")
True

The only time Python 2.6 can end up with an empty string that is not the normal shared empty string is when it performs a string operation without knowing in advance how long the result will be. When you encode a string, for example, the error handler can end up stripping characters, and the buffer size is only fixed up after the operation has completed. Of course that's an oversight and could easily change in Python 2.7.
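
A rough way to probe this on whatever interpreter is at hand is to build empty strings by several routes and compare identity against equality. This is only a sketch: the helper name probe_empty_strings and the particular set of operations are illustrative choices, not part of the original answer. The identity column is an implementation detail that may differ between interpreters and versions, so no expected output is shown.

```python
def probe_empty_strings():
    """Return {label: (equal_to_empty, same_object_as_empty)}.

    Hypothetical helper for illustration only.
    """
    empty = ""  # reference object, bound to a name so 'is' is not used on a literal
    candidates = {
        "literal": "",
        "slice": "abc"[2:2],
        "join": "".join([]),
        "replace": "x".replace("x", ""),
        "strip": "   ".strip(),
        "constructor": str(),
    }
    results = {}
    for label, value in candidates.items():
        # == compares contents, so it is True for every candidate.
        # 'is' compares identity, which is not guaranteed either way.
        results[label] = (value == empty, value is empty)
    return results


if __name__ == "__main__":
    for label, (equal, identical) in sorted(probe_empty_strings().items()):
        print("%-12s equal=%s identical=%s" % (label, equal, identical))
```

Any row where identical is False is a counterexample of the kind the question asks for; the absence of such a row on one interpreter says nothing about another.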

Duncan answered Oct 17 '22 01:10


You shouldn't care. Unlike None, which is defined to be a singleton, there is no rule that says there is only one empty string object. So the result of s is "" is implementation-dependent, and using is here is a NO-NO whether you can find an example or not.
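
The contrast can be sketched as follows, assuming a hypothetical helper named classify: identity is used only for None, where the language guarantees a single object, and equality is used for the empty string.

```python
def classify(value):
    """Hypothetical helper: label a value as missing, empty, or non-empty."""
    if value is None:   # None is a guaranteed singleton, so 'is' is correct
        return "missing"
    if value == "":     # compare contents; never rely on identity here
        return "empty"
    return "non-empty"


print(classify(None))        # missing
print(classify("abc"[2:2]))  # empty
print(classify("abc"))       # non-empty
```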

John Machin answered Oct 17 '22 01:10


It seems to work for anything that actually is a string, but something that merely looks like a string (e.g. a unicode object, a subclass of str, or something similar) will fail.

>>> class mysub(str):
...     def __init__(self, *args, **kwargs):
...         super(mysub, self).__init__(*args, **kwargs)
...
>>> q = mysub("")
>>> q is ""
False
>>> q == ""
True

edit:

For code review and feedback, I would call this bad practice because it performs an unexpected test (even setting aside the uncertainty about whether it will always behave the same way when the types match).

if x is ""

Implies that x has both the correct value and the correct type, but without an explicit type test that would alert future maintainers or API users.

if x == ""

Implies only that x has the correct value.
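
If the type really does matter, one option is to state it explicitly next to the value comparison. The sketch below assumes two hypothetical helpers, is_empty_value and is_exact_empty_str, and a subclass named MySub that exists only for the demonstration.

```python
class MySub(str):
    """Illustrative str subclass; not from the original answer."""


def is_empty_value(x):
    # Value-only test: true for any object that compares equal to "".
    return x == ""


def is_exact_empty_str(x):
    # Value test plus an explicit, visible type requirement.
    return type(x) is str and x == ""


q = MySub("")
print(is_empty_value(q))       # True: the contents match
print(is_exact_empty_str(q))   # False: q is a MySub, not exactly a str
print(is_exact_empty_str(""))  # True
```

Spelling out the type check makes the intent visible to a reviewer, which the identity comparison does not.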

pycruft answered Oct 17 '22 01:10