
Python string with space and without space at the end and immutability

I learnt that in some immutable classes, __new__ may return an existing instance - this is what the int, str and tuple types sometimes do for small values.

But why do the following two snippets behave differently?

With a space at the end:

>>> a = 'string '
>>> b = 'string '
>>> a is b
False

Without a space:

>>> c = 'string'
>>> d = 'string'
>>> c is d
True

Why does the trailing space make a difference?

James asked Jan 18 '14 11:01



1 Answer

This is a quirk of how the CPython implementation chooses to cache string literals. String literals with the same contents may refer to the same string object, but they don't have to. 'string' happens to be automatically interned while 'string ' isn't, because 'string' contains only characters allowed in a Python identifier. I have no idea why that's the criterion they chose, but it is. The behavior may differ across Python versions and implementations.

From the CPython 2.7 source code, stringobject.h, line 28:

Interning strings (ob_sstate) tries to ensure that only one string object with a given value exists, so equality tests can be one pointer comparison. This is generally restricted to strings that "look like" Python identifiers, although the intern() builtin can be used to force interning of any string.
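As a rough illustration of forcing interning by hand: in Python 3 the intern() builtin mentioned above lives at sys.intern(). The sketch below builds the two strings at runtime (the ''.join calls are only there so the compiler cannot merge them as constants) and then interns them. Whether the two objects are distinct before interning is an implementation detail; only the result after interning is guaranteed.

```python
import sys

# Build both strings at runtime so they are not compile-time constants.
a = ''.join(['string', ' '])
b = ''.join(['string', ' '])

equal_before = a == b   # contents are equal
same_before = a is b    # identity here is implementation-dependent

# sys.intern returns the single canonical object for a given string value.
a_interned = sys.intern(a)
b_interned = sys.intern(b)
same_after = a_interned is b_interned

print(equal_before, same_before, same_after)
```

This is also why code should compare strings with == rather than is: identity only reflects caching decisions, not contents.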

You can see the code that does this in Objects/codeobject.c:

/* Intern selected string constants */
for (i = PyTuple_Size(consts); --i >= 0; ) {
    PyObject *v = PyTuple_GetItem(consts, i);
    if (!PyString_Check(v))
        continue;
    if (!all_name_chars((unsigned char *)PyString_AS_STRING(v)))
        continue;
    PyString_InternInPlace(&PyTuple_GET_ITEM(consts, i));
}
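The check in that loop can be approximated in Python. This is only a sketch: the function name mirrors the C helper, and the character set (ASCII letters, digits and underscore) is an assumption about what the C code accepts, not a copy of it.

```python
import string

# Assumed character set: ASCII letters, digits and underscore.
NAME_CHARS = set(string.ascii_letters + string.digits + "_")

def all_name_chars(s):
    """Return True if every character of s is in NAME_CHARS."""
    return all(ch in NAME_CHARS for ch in s)

print(all_name_chars('string'))   # no disallowed characters
print(all_name_chars('string '))  # the trailing space fails the check
```

Under this criterion 'string' qualifies for automatic interning and 'string ' does not, which matches the difference seen in the question.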

Also, note that interning is a separate process from the merging of string literals by the Python bytecode compiler. If you let the compiler compile the a and b assignments together, e.g. by placing them in a module or inside an if True: block, you would find that a and b would be the same string.
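One way to try this without creating a module is to hand both assignments to compile() as a single unit, so they end up in one code object. The names source and namespace and the "<demo>" filename are just for illustration, and the merging itself is a compiler detail rather than a language guarantee.

```python
# Both assignments are compiled together as one unit.
source = "a = 'string '\nb = 'string '\n"

namespace = {}
exec(compile(source, "<demo>", "exec"), namespace)

# When the compiler merges equal constants, both names share one object.
merged = namespace["a"] is namespace["b"]
print(merged)
```

At the interactive prompt each line is compiled separately, which is why the question's a and b ended up as distinct objects.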

user2357112 supports Monica answered Oct 14 '22 13:10