
I found myself swinging the list comprehension hammer

... and every for-loop looked like a list comprehension.

Instead of:

for stuff in all_stuff:
    do(stuff)

I was doing (not assigning the list to anything):

[ do(stuff) for stuff in all_stuff ]

This is a common pattern found on list-comp how-to's. 1) OK, so no big deal right? Wrong. 2) Can't this just be code style? Super wrong.

1) Yeah, that was wrong. As NiklasB points out, the point of the HowTos is to build up a new list.

2) Maybe, but it's not obvious and explicit, so better not to use it.

I didn't keep in mind that those how-to's were largely command-line based. After my team yelled at me wondering why the hell I was building up massive lists and then letting them go, it occurred to me that I might be introducing a major memory-related bug.

So here are my questions. If I were to do this in a very long running process, where lots of data was being consumed, would this "list" just continue consuming my memory until let go? When will the garbage collector claim the memory back? After the scope this list is built in is lost?

My guess is yes, it will keep consuming my memory. I don't know how the python garbage collector works, but I would venture to say that this list will exist until after the last next is called on all_stuff.

EDIT.

The essence of my question is conveyed much more cleanly in this question (thanks for the link, Niklas).

sbartell asked Mar 17 '12 00:03


3 Answers

If I were to do this in a very long running process, where lots of data was being consumed, would this "list" just continue consuming my memory until let go?

Absolutely.
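To put a rough number on that, here is a minimal sketch that compares peak memory for the two styles using the standard library's tracemalloc. The do() below is a hypothetical stand-in for the one in the question (it just returns a fresh string), and range(50_000) stands in for all_stuff; both are assumptions made for illustration.

```python
import tracemalloc


def do(stuff):
    # Hypothetical stand-in for the question's do(): returns a new object each call.
    return str(stuff) * 8


def peak_bytes(func):
    """Run func and return the peak number of bytes traced while it ran."""
    tracemalloc.start()
    func()
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak


def with_comprehension():
    # Every return value stays alive inside the throwaway list until the
    # comprehension has finished.
    [do(stuff) for stuff in range(50_000)]


def with_loop():
    # Each return value can be released before the next call is made.
    for stuff in range(50_000):
        do(stuff)


comp_peak = peak_bytes(with_comprehension)
loop_peak = peak_bytes(with_loop)
print("comprehension peak:", comp_peak, "bytes")
print("for-loop peak:     ", loop_peak, "bytes")
```

The exact figures depend on the interpreter and on what do() returns, but the comprehension's peak should grow with the length of all_stuff while the loop's stays roughly flat.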

When will the garbage collector claim the memory back? After the scope this list is built in is lost?

CPython uses reference counting, so the list is freed as soon as its last reference disappears. Other implementations work differently, so don't count on it.

Thanks to Karl for pointing out that due to the complex memory management mechanisms used by CPython this does not mean that the memory is immediately returned to the OS after that.
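One way to watch the reference counting at work is to track the objects that do() returns through a weakref.WeakSet, as in the sketch below. The Result class and the alive set are made up for this illustration (plain lists cannot be weakly referenced, so the elements are tracked instead), and the behaviour described in the comments is what I would expect on CPython only.

```python
import weakref


class Result:
    """Hypothetical return value of do(); instances can be weakly referenced."""


alive = weakref.WeakSet()  # holds only the Result objects that still exist


def do(stuff):
    result = Result()
    alive.add(result)
    return result


# Result never assigned: on CPython the list, and with it every element,
# is freed as soon as the statement finishes.
[do(stuff) for stuff in range(1000)]
after_unassigned = len(alive)

# Result assigned: everything stays alive for as long as the name does.
kept = [do(stuff) for stuff in range(1000)]
while_referenced = len(alive)

del kept
after_del = len(alive)

print(after_unassigned, while_referenced, after_del)
```

So the throwaway list does not outlive the statement on CPython, but it still holds every element at once while it is being built, which is where the memory spike comes from.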

I don't know how the python garbage collector works, but I would venture to say that this list will exist until after the last next is called on all_stuff.

I don't think any garbage collector works like that. Usually they mark-and-sweep, so it could be quite some time before the list is garbage collected.

This is a common pattern found on list-comp how-to's.

Absolutely not. The point is that you iterate the list with the purpose of doing something with every item (do is called for its side effects). In all the examples of the List-comp HOWTO, the list is iterated to build up a new list based on the items of the old one. Let's look at an example:

# list comp, creates the list [0,1,2,3,4,5,6,7,8,9]
[i for i in range(10)]

# loop, does nothing
for i in range(10):
    i  # meh, just an expression which doesn't have an effect

Maybe you'll agree that this loop is utterly senseless, as it doesn't do anything, in contrast to the comprehension, which builds a list. In your example, it's the other way round: the comprehension is completely senseless, because you don't need the list! You can find more information about the issue in a related question.

By the way, if you really want to write that loop in one line, use a generator consumer like deque.extend. This will be slightly slower than a raw for loop in this simple example, though:

>>> from collections import deque
>>> consume = deque(maxlen=0).extend
>>> consume(do(stuff) for stuff in all_stuff)
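For a version that runs as a script, here is a small sketch of the same idea. The consume() helper, the calls list, and this particular do() are placeholders invented for the example; the only real API involved is collections.deque with maxlen=0, which discards every item it is fed.

```python
from collections import deque


def consume(iterable):
    """Exhaust an iterable without storing any of its items."""
    # A deque with maxlen=0 throws away each item as soon as it arrives.
    deque(iterable, maxlen=0)


calls = []


def do(stuff):
    # Hypothetical side effect: record that the item was processed.
    calls.append(stuff)
    return stuff.upper()


all_stuff = ["a", "b", "c"]

# The generator expression yields one return value at a time, so no list
# of results is ever built.
result = consume(do(stuff) for stuff in all_stuff)
print(calls, result)
```

It is still less readable than the plain for loop, so it is mostly worth it when a one-liner is genuinely required.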
Niklas B. answered Oct 17 '22 01:10


Try manually doing GC and dumping the statistics.

gc.DEBUG_STATS

Print statistics during collection. This information can be useful when tuning the collection frequency.

From: http://docs.python.org/library/gc.html
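A minimal sketch of what that could look like in practice: switch on gc.DEBUG_STATS, create some garbage that only the cycle collector can reclaim, and trigger a collection by hand. The self-referencing lists are just an assumption to give the collector something to find; the statistics themselves are written to stderr.

```python
import gc

gc.disable()   # keep automatic collections from running during the demo
gc.collect()   # start from a clean slate

gc.set_debug(gc.DEBUG_STATS)  # print statistics to stderr on each collection

# Create 100 lists that each contain themselves. Reference counting alone
# can never free these, so they are left for the cycle collector.
for _ in range(100):
    cycle = []
    cycle.append(cycle)
del cycle

found = gc.collect()  # manual collection; returns the number of unreachable objects
print("unreachable objects found:", found)

gc.set_debug(0)  # switch the debug output off again
gc.enable()
```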

FlavorScape answered Oct 17 '22 03:10


The CPython GC will reap it once there are no references to it outside of a cycle. Jython and IronPython follow the rules of the underlying GCs.
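The "outside of a cycle" part can be seen with a list that refers to itself. In this sketch TrackedList is a hypothetical subclass that exists only because plain lists cannot be weakly referenced; the comments describe what I would expect on CPython.

```python
import gc
import weakref


class TrackedList(list):
    """Hypothetical list subclass; unlike list, it supports weak references."""


gc.disable()  # make sure only the explicit collect() below runs

items = TrackedList()
items.append(items)        # the list now refers to itself: a reference cycle
probe = weakref.ref(items)

del items
# The self-reference keeps the reference count above zero, so reference
# counting alone does not free the list.
alive_after_del = probe() is not None

gc.collect()
# The cycle collector notices that nothing outside the cycle refers to it.
alive_after_collect = probe() is not None

gc.enable()
print(alive_after_del, alive_after_collect)
```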

Ignacio Vazquez-Abrams answered Oct 17 '22 01:10