Identity quirk with string split()

Question

>>> 'hi'.split()[0] is 'hi'
True
>>> 'hi there'.split()[0] is 'hi'
False
>>> 'hi there again'.split()[0] is 'hi'
False

My hypothesis:

The first line has only one element in split, while the other two have more than one element. I believe that while Python primitives like str are stored in memory by value within a function, there will be separate allocations across functions to simplify memory management. I think split() is one of those functions, and it usually allocates new strings. But it also handles the edge case of input that does not need any splitting (such as 'hi'), where the original string reference is simply returned. Is my explanation correct?

user2357112 supports Monica · Accepted Answer

"I believe that while Python primitives like str are stored in memory by value within a function, there will be separate allocations across functions to simplify memory management."

Python's object allocation doesn't work anything like that. There isn't a real concept of "primitives", and aside from a few things the bytecode compiler does to merge constants, it doesn't matter whether two objects are created in the same function or different functions.
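As a rough illustration of that point, here is a small sketch that builds equal strings at runtime, so compile-time constant merging is not involved. The function names are made up for this example. Equality holds in every case; whether any two of the results are the same object is up to the implementation, and being in the same function gives no guarantee either way.

```python
def build_in_one_function():
    # Two equal strings created at runtime inside the same function.
    parts = ["h", "i"]
    return "".join(parts), "".join(parts)


def build_elsewhere():
    # An equal string created at runtime in a different function.
    return "".join(["h", "i"])


a, b = build_in_one_function()
c = build_elsewhere()

print(a == b == c)     # equality compares values
print(a is b, a is c)  # identity is an implementation detail; do not rely on it
```

The second line of output may differ between interpreters and versions, which is why it carries no expected value here.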

There isn't really a better answer to this than to point to the source, so here it is:

Py_LOCAL_INLINE(PyObject *)
STRINGLIB(split_whitespace)(PyObject* str_obj,
                           const STRINGLIB_CHAR* str, Py_ssize_t str_len,
                           Py_ssize_t maxcount)
{
    ...
#ifndef STRINGLIB_MUTABLE
        if (j == 0 && i == str_len && STRINGLIB_CHECK_EXACT(str_obj)) {
            /* No whitespace in str_obj, so just use it as list[0] */
            Py_INCREF(str_obj);
            PyList_SET_ITEM(list, 0, (PyObject *)str_obj);
            count++;
            break;
        }

If it doesn't find any whitespace to split on, it just reuses the original string object in the returned list. It's just a quirk of how this function was written, and you can't depend on it working that way in other Python versions or nonstandard Python implementations.
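In practice this means comparing the results of split() with == rather than is. The sketch below uses a hypothetical helper, probe(), that reports both value equality and whether the first token is the very same object as the input. The strings are built at runtime to keep constant merging out of the picture.

```python
def probe(text, expected_first):
    # Hypothetical helper: inspect the first token produced by split().
    first = text.split()[0]
    return {
        "equal": first == expected_first,       # value comparison, always reliable
        "same_object_as_input": first is text,  # identity, implementation-dependent
    }


single = "".join(["h", "i"])       # 'hi', no whitespace to split on
multi = " ".join(["hi", "there"])  # 'hi there'

result_single = probe(single, "hi")
result_multi = probe(multi, "hi")

print(result_single)
print(result_multi)
```

For multi, the first token can never be the input object, because the two strings have different values. For single, the identity result depends on whether the interpreter takes the shortcut shown in the source above. Note also that Python 3.8 and later emit a SyntaxWarning for is comparisons against a literal such as 'hi', which is another hint that == is the intended operator.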
