I have the following problem. Given a list of integers L, I need to generate all of the sublists L[k:] for k in [0, len(L) - 1], without generating copies. How do I accomplish this in Python? With a buffer object somehow?
Note: slicing returns a new list object, but the elements inside it are the same objects as in the original list, not copies of them.
Slicing lists does not generate copies of the objects in the list; it just copies the references to them. That is the answer to the question as asked.
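To tie that back to the question: each suffix L[k:] is a new (small) list object, but everything inside it is shared with the original. A quick sketch with an illustrative list of dicts:

```python
L = [{"n": 0}, {"n": 1}, {"n": 2}]

# Each suffix L[k:] is a distinct list object...
suffixes = [L[k:] for k in range(len(L))]

# ...but the elements inside are the very same objects, not copies.
assert all(suffixes[k][0] is L[k] for k in range(len(L)))

# Mutating an element through the original is visible through every slice.
L[2]["n"] = 99
print(suffixes[0][2])  # {'n': 99}
```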
First, let's test the basic claim. We can show that even in the case of immutable objects like integers, only the reference is copied. Here are three different integer objects, each with the same value:
>>> a = [1000 + 1, 1000 + 1, 1000 + 1]
They have the same value, but you can see they are three distinct objects because they have different ids:
>>> list(map(id, a))
[140502922988976, 140502922988952, 140502922988928]
When you slice them, the references remain the same. No new objects have been created:
>>> b = a[1:3]
>>> list(map(id, b))
[140502922988952, 140502922988928]
Using different objects with the same value shows that the copy process doesn't bother with interning -- it just directly copies the references.
Testing with mutable values gives the same result:
>>> a = [{0: 'zero', 1: 'one'}, ['foo', 'bar']]
>>> list(map(id, a))
[4380777000, 4380712040]
>>> list(map(id, a[1:]))
[4380712040]
Of course the references themselves are copied. Each one costs 8 bytes on a 64-bit machine. And each list has its own memory overhead of 72 bytes:
>>> import sys
>>> for i in range(len(a)):
...     x = a[:i]
...     print('len: {}'.format(len(x)))
...     print('size: {}'.format(sys.getsizeof(x)))
...
len: 0
size: 72
len: 1
size: 80
len: 2
size: 88
As Joe Pinsonault reminds us, that overhead adds up. And integer objects themselves are not very large -- only about three times the size of a reference. So slicing saves you some memory in an absolute sense, but only by a constant factor; asymptotically, it might be nice to be able to have multiple lists that are "views" into the same memory.
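To put a rough number on that: building all n suffixes of a list copies O(n**2) references in total, even though no element object is duplicated. A sketch (exact byte counts vary by Python version and platform):

```python
import sys

L = list(range(1000))

# Generating every suffix copies O(n**2) references in total,
# even though no element object is duplicated.
suffixes = [L[k:] for k in range(len(L))]

# Total size of the list structures alone (not the elements).
total = sum(sys.getsizeof(s) for s in suffixes)
print(f"{len(suffixes)} suffixes, ~{total} bytes of list structure")
```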
Unfortunately, Python provides no easy way to produce objects that are "views" into lists. Or perhaps I should say "fortunately"! It means you don't have to worry about where a slice comes from; changes to the original won't affect the slice. Overall, that makes reasoning about a program's behavior much easier.
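That said, if you did want view semantics over a plain list, you could roll a minimal read-only wrapper yourself. The ListView class below is purely illustrative -- there is nothing like it in the standard library:

```python
class ListView:
    """A hypothetical read-only view of a suffix of a list (not stdlib)."""

    def __init__(self, data, start=0, stop=None):
        self._data = data
        self._start = start
        self._stop = len(data) if stop is None else stop

    def __len__(self):
        return self._stop - self._start

    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        return self._data[self._start + i]


L = [10, 20, 30, 40]
v = ListView(L, 1)      # behaves like L[1:], but shares storage with L
L[1] = 99               # changes to the original are visible through the view
print(v[0])             # 99
```

Note the trade-off the text describes: with a view like this, mutations to the original do show through, which is exactly the behavior plain-list slicing protects you from.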
If you really want to save memory by working with views, consider using numpy arrays. When you slice a numpy array, the memory is shared between the slice and the original:
>>> import numpy
>>> a = numpy.arange(3)
>>> a
array([0, 1, 2])
>>> b = a[1:3]
>>> b
array([1, 2])
What happens when we modify a and look again at b?
>>> a[2] = 1001
>>> b
array([   1, 1001])
But this means you have to be sure that when you modify one object, you aren't inadvertently modifying another. That's the trade-off when you use numpy: less work for the computer, and more work for the programmer!
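If you go the numpy route, you can check explicitly whether two arrays share memory using numpy.shares_memory:

```python
import numpy as np

a = np.arange(5)
b = a[1:4]                      # basic slicing returns a view, not a copy

print(np.shares_memory(a, b))   # True
b[0] = 42                       # writes through to a
print(a)

c = a[1:4].copy()               # an explicit copy does not share memory
print(np.shares_memory(a, c))   # False
```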
Depending on what you're doing, you might be able to use islice. Since it operates via iteration, it won't make new lists, but instead will simply create iterators that yield elements from the original list as requested for their ranges.
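For example, to get a lazy "suffix" for each k without copying (note that each iterator can only be consumed once, and skipping the first k elements still costs O(k) work when iteration starts):

```python
from itertools import islice

L = list(range(10))

# One lazy "suffix" per k: no list copies, just iterators over L.
suffix_iters = [islice(L, k, None) for k in range(len(L))]

print(list(suffix_iters[7]))  # [7, 8, 9]
```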