
Python strings references [duplicate]

Possible Duplicate:
Python '==' vs 'is' comparing strings, 'is' fails sometimes, why?

Hi. I have a question about how Python works when it comes to how and when references are used.

I have an example here that I understand.

a = "cat"
b = a
a is b
   True

This makes sense. But here comes something I don't understand.

a = "cat"
b = "cat"
a is b
   True
c = 1.2
d = 1.2
c is d
    False
e = "cat"
f = "".join(a)
e is f
    False

Why does a is b return True but c is d does not? Both types are immutable, right? Since it works for strings but not for floats, I can only imagine it to be some kind of optimization, but I am happy for any answer.

I also tried some other things and got this result:

a = "cat"
b = "c"
c = b+"at"
a is c
    False # Why not same as setting c = "cat"
d = "cat"+""
a is d
    True # Probably same as setting d = "cat"
e = "c"+"at"
a is e
    True # Probably same as setting e = "cat"

I guess this is the same problem, but why does it not give True when the variable b is used to create "cat"?

I use Python 2.5, if that makes any difference.

Any tips and ideas useful here are appreciated.

asked Jan 13 '11 11:01 by Henke




1 Answer

a = "cat"
b = "cat"
a is b
   True
c = 1.2
d = 1.2
c is d
    False

Why does a is b return True and not c is d?

Well, the better question would be "why does c is d return False when a is b does not?", since the logically expected behavior is False in both cases: they are separate objects, created in separate places.

The thing is, CPython (the current implementation of Python, written in C) caches strings and small ints as a means of optimization. The logic behind that optimization is that, since these objects are immutable anyway, sharing them shouldn't matter. But you shouldn't rely on that behavior, because it is implementation-specific and not part of the language.

Always use == to compare strings, not is. In CPython, == on strings also has a shortcut: it checks identity first, and only if the operands aren't the same object does it proceed with the equality test. So it shouldn't matter performance-wise.
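To make the difference concrete, here is a minimal sketch in Python 3 syntax (the question uses 2.5, where the same idea applies). The helper name same_value_and_identity is made up for this illustration. The == result holds on any implementation; the is result is whatever the implementation happens to do, so treat it as an observation rather than a guarantee. sys.intern is the standard way to explicitly ask for the shared copy of a string.

```python
import sys


def same_value_and_identity(x, y):
    """Return (x == y, x is y) so both comparisons can be seen side by side."""
    return (x == y, x is y)


a = "cat"
# Built at runtime from pieces, so it is usually a separate object from `a`.
b = "".join(["c", "a", "t"])

value_equal, same_object = same_value_and_identity(a, b)
print(value_equal)   # True: the contents are equal on any implementation
print(same_object)   # implementation-specific: do not rely on this value

# Explicit interning returns one canonical object per distinct string value.
c = sys.intern(b)
print(c is sys.intern("cat"))  # True: both calls return the interned object
```

If you genuinely need identity semantics for strings (for example, as a micro-optimization for dictionary keys), interning explicitly like this is safer than hoping the implementation did it for you.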

It looks like you've already found, in the rest of your question, why you can't rely on it even within CPython itself.
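As for why "c"+"at" behaves like the literal "cat" while b+"at" does not, one plausible explanation (a CPython detail, and an assumption about the compiler's behavior rather than anything the language promises) is constant folding: an expression made only of literals can be evaluated at compile time, whereas one involving a variable has to be evaluated at runtime and produces a fresh object. A rough sketch that inspects the compiled constants, with constants_of being a hypothetical helper name:

```python
def constants_of(source):
    """Return the constants stored in the code object compiled from `source`."""
    return compile(source, "<example>", "exec").co_consts


# Only literals are involved, so the compiler can precompute the result.
folded = constants_of('e = "c" + "at"')

# `b` is a variable, so the concatenation can only happen at runtime.
runtime = constants_of('c = b + "at"')

print("cat" in folded)   # expected True on CPython: "cat" was computed at compile time
print("cat" in runtime)  # expected False: only "at" is stored, "cat" is built later
```

Once the folded "cat" is an ordinary constant, it is subject to the same caching as a literal "cat", which would explain the True results in the question. The runtime concatenation never goes through that path.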

answered Sep 19 '22 13:09 by nosklo