... and every for-loop looked like a list comprehension.
Instead of:
for stuff in all_stuff:
    do(stuff)
I was doing (not assigning the list to anything):
[ do(stuff) for stuff in all_stuff ]
This is a common pattern found on list-comp how-to's. 1) OK, so no big deal right? Wrong. 2) Can't this just be code style? Super wrong.
1) Yeah, that was wrong. As NiklasB points out, the point of the HowTos is to build up a new list.
2) Maybe, but it's not obvious and explicit, so better not to use it.
I didn't keep in mind that those how-to's were largely command-line based. After my team yelled at me wondering why the hell I was building up massive lists and then letting them go, it occurred to me that I might be introducing a major memory-related bug.
So here's my question(s). If I were to do this in a very long running process, where lots of data was being consumed, would this "list" just continue consuming my memory until let go? When will the garbage collector claim the memory back? After the scope this list is built in is lost?
My guess is yes, it will keep consuming my memory. I don't know how the python garbage collector works, but I would venture to say that this list will exist until after the last next is called on all_stuff.
EDIT.
The essence of my question is relayed much cleaner in this question (thanks for the link Niklas)
If I were to do this in a very long running process, where lots of data was being consumed, would this "list" just continue consuming my memory until let go?
Absolutely.
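To make that concrete, here is a minimal sketch that compares peak memory for the two styles using the standard tracemalloc module. The do function and the range(50_000) input are hypothetical stand-ins for the question's do and all_stuff, and the exact numbers will vary by Python version and platform.

```python
import tracemalloc


def do(stuff):
    # Hypothetical stand-in for the question's do(): returns a fresh object,
    # which the throwaway list then keeps alive.
    return [stuff]


def peak_bytes(func):
    # Measure the peak number of bytes allocated while func() runs.
    tracemalloc.start()
    func()
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak


def with_comprehension():
    # Builds a 50,000-element list and then discards it.
    [do(stuff) for stuff in range(50_000)]


def with_loop():
    # Each result is dropped before the next one is created.
    for stuff in range(50_000):
        do(stuff)


peak_comp = peak_bytes(with_comprehension)
peak_loop = peak_bytes(with_loop)
print("comprehension peak:", peak_comp, "bytes")
print("plain loop peak:   ", peak_loop, "bytes")
```

The comprehension's peak should grow with the length of the input, while the loop's peak should stay roughly flat.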
When will the garbage collector claim the memory back? After the scope this list is built in is lost?
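CPython uses reference counting, so the list is freed as soon as the last reference to it disappears; for a list that is never assigned to a name, that is right after the statement that built it. Other implementations work differently, so don't count on it.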
Thanks to Karl for pointing out that due to the complex memory management mechanisms used by CPython this does not mean that the memory is immediately returned to the OS after that.
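A small sketch of the reference-counting behaviour, assuming CPython (other implementations may run the callbacks much later, or only at exit). Result and do are hypothetical names used only for illustration; weakref.finalize registers a callback that fires when the object is destroyed.

```python
import weakref


class Result:
    # Hypothetical result type; plain class instances support weak references.
    pass


freed = []


def do(stuff):
    r = Result()
    # Record the item in `freed` once this Result object is destroyed.
    weakref.finalize(r, freed.append, stuff)
    return r


# The list is never bound to a name, so on CPython its reference count
# drops to zero as soon as this statement finishes.
[do(stuff) for stuff in range(3)]

print(len(freed), "results already freed")
```

On CPython all three results are gone by the time the print runs, which shows the list did not outlive its statement. As the note above says, that does not mean the memory has been handed back to the OS.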
I don't know how the python garbage collector works, but I would venture to say that this list will exist until after the last next is called on all_stuff.
I don't think any garbage collector works like that. Usually they mark-and-sweep, so it could be quite some time before the list is garbage collected.
This is a common pattern found on list-comp how-to's.
Absolutely not. The point is that you iterate the list with the purpose of doing something with every item (do is called for its side-effects). In all the examples of the List-comp HOWTO, the list is iterated to build up a new list based on the items of the old one. Let's look at an example:
# list comp, creates the list [0,1,2,3,4,5,6,7,8,9]
[i for i in range(10)]
# loop, does nothing
for i in range(10):
    i  # meh, just an expression which doesn't have an effect
Maybe you'll agree that this loop is utterly senseless, as it doesn't do anything, in contrast to the comprehension, which builds a list. In your example, it's the other way round: the comprehension is completely senseless, because you don't need the list! You can find more information about the issue on a related question.
By the way, if you really want to write that loop in one line, use a generator consumer like deque.extend. This will be slightly slower than a raw for loop in this simple example, though:
>>> from collections import deque
>>> consume = deque(maxlen=0).extend
>>> consume(do(stuff) for stuff in all_stuff)
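For a version that runs as a script, here is a sketch with hypothetical stand-ins for do and all_stuff. It keeps a handle on the zero-length deque (called sink here, a name not in the original) only to show that nothing is stored in it.

```python
from collections import deque

seen = []


def do(stuff):
    # Hypothetical stand-in: the side effect is recording the call.
    seen.append(stuff)


all_stuff = ["a", "b", "c"]

# A deque with maxlen=0 discards everything it is given, so extend()
# drives the generator to exhaustion without accumulating results.
sink = deque(maxlen=0)
consume = sink.extend
consume(do(stuff) for stuff in all_stuff)

print(seen)       # every item was processed
print(len(sink))  # nothing was kept
```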
Try manually doing GC and dumping the statistics.
gc.DEBUG_STATS
Print statistics during collection. This information can be useful when tuning the collection frequency.
From http://docs.python.org/library/gc.html
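A minimal sketch of what that might look like. The Node class and the deliberate reference cycle are assumptions added here so the collector has something to find; a throwaway list with no cycles is normally freed by reference counting on CPython and would not show up in these statistics.

```python
import gc


class Node:
    # Hypothetical class used only to build a reference cycle.
    def __init__(self):
        self.other = None


gc.disable()  # keep automatic collections out of the way for the demo

a, b = Node(), Node()
a.other, b.other = b, a  # a <-> b form a cycle
del a, b                 # now unreachable, but the refcounts are still nonzero

gc.set_debug(gc.DEBUG_STATS)  # statistics are written to stderr
found = gc.collect()          # returns the number of unreachable objects found
gc.set_debug(0)
gc.enable()

print("unreachable objects found:", found)
```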
The CPython GC will reap it once there are no references to it outside of a cycle. Jython and IronPython follow the rules of the underlying GCs.