Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Since Tuples are immutable, why does slicing them make a copy instead of a view?

Tags:

python

As far as I understand it, tuples and strings are immutable to allow optimizations such as re-using memory that won't change. However, one obvious optimisation, making slices of tuples refer to the same memory as the original tuple, is not included in python.

I know that this optimization isn't included because when I time the following function, time taken goes like O(n^2) instead of O(n), so full copying is taking place:

def test(n):
    tup = tuple(range(n))
    for i in xrange(n):
        tup[0:i]

Is there some behavior of python that would change if this optimization was implemented? Is there some performance benefit to copying even when the original is immutable?

like image 518
QuadmasterXLII Avatar asked Jan 10 '16 20:01

QuadmasterXLII


People also ask

Does slicing create a copy?

Slicing lists does not generate copies of the objects in the list; it just copies the references to them. That is the answer to the question as asked.

Does slicing work on tuples?

We can use slicing in tuples I'm the same way as we use in strings and lists. Tuple slicing is basically used to obtain a range of items. Furthermore, we perform tuple slicing using the slicing operator. We can represent the slicing operator in the syntax [start:stop:step].

Can a tuple be sliced?

Slicing. We can access a range of items in a tuple by using the slicing operator colon : . Slicing can be best visualized by considering the index to be between the elements as shown below. So if we want to access a range, we need the index that will slice the portion from the tuple.

Does slicing create a new list Python?

In short, slicing is a flexible tool to build new lists out of an existing list. Python supports slice notation for any sequential data type like lists, strings, tuples, bytes, bytearrays, and ranges. Also, any new data structure can add its support as well.


1 Answers

By view, are you thinking of something equivalent to what numpy does? I'm familiar with how and why numpy does that.

A numpy array is an object with shape and dtype information, plus a data buffer. You can see this information in the __array_interface__ property. A view is a new numpy object, with its own shape attribute, but with a new data buffer pointer that points to someplace in the source buffer. It also has a flag that says "I don't own the buffer". numpy also maintains its own reference count, so the data buffer is not destroyed if the original (owner) array is deleted (and garbage collected).

This use of views can be big time saver, especially with very large arrays (questions about memory errors are common on SO). Views also allow different dtype, so a data buffer can be viewed at 4 byte integers, or 1 bytes characters, etc.

How would this apply to tuples? My guess is that it would require a lot of extra baggage. A tuple consists of a fixed set of object pointers - probably a C array. A view would use the same array, but with its own start and end markers (pointers and/or lengths). What about sharing flags? Garbage collection?

And what's the typical size and use of tuples? A common use of tuples is to pass arguments to a function. My guess is that a majority of tuples in a typical Python run are small - 0, 1 or 2 elements. Slices are allowed, but are they very common? On small tuples or very large ones?

Would there be any unintended consequences to making tuple slices views (in the numpy sense)? The distinction between views and copies is one of the harder things for numpy users to grasp. Since a tuple is supposed to be immutable - that is the pointers in the tuple cannot be changed - it is possible that implementing views would be invisible to users. But still I wonder.

It may make most sense to try this idea on a branch of the PyPy version - unless you really like to get dig into Cpython code. Or as a custom class with Cython.

like image 57
hpaulj Avatar answered Sep 25 '22 09:09

hpaulj