This bit stung me recently. I solved it by removing all comparisons of numpy arrays with lists from the code. But why does the garbage collector miss to collect it?
Run this and watch it eat your memory:
import numpy as np
r = np.random.rand(2)
l = []
while True:
r == l
Running on 64bit Ubuntu 10.04, virtualenv 1.7.2, Python 2.7.3, Numpy 1.6.2
NumPy arrays are faster and more compact than Python lists. An array consumes less memory and is convenient to use. NumPy uses much less memory to store data and it provides a mechanism of specifying the data types. This allows the code to be optimized even further.
NumPy uses much less memory to store data The NumPy arrays takes significantly less amount of memory as compared to python lists. It also provides a mechanism of specifying the data types of the contents, which allows further optimisation of the code. If this difference seems intimidating then prepare to have more.
array('d', L) then the array will occupy less memory than the list. But not much less. the implementation of a list in python is dynamic i.e it always allocates more memory than the existing number of items in the list actually x2 times. Whereas array is just a wrapper of c and store homogeneous data.
As the array size increase, Numpy gets around 30 times faster than Python List. Because the Numpy array is densely packed in memory due to its homogeneous type, it also frees the memory faster.
Just in case someone stumbles on this and wonders...
@Dugal yes, I believe this is a memory leak in current numpy versions (Sept. 2012) that occurs when some Exceptions are raised (see this and this). Why adding the gc
call that @BiRico did "fixes" it seems weird to me, though it must be done right after appearently? Maybe its an oddity with how python garbage collects tracebacks, if someone knows the Exception handling and garbage colleciton CPython Internals, I would be interested.
Workaround: This is not directly related to lists, but for example most broadcasting Exceptions (the empty list does not fit to the arrays size, an empty array results in the same leak. Note that internally there is an Exception prepared that never surfaces). So as a workaround, you should probably just check first if the shape is correct (if you do it a lot, otherwise I wouldn't worry really, this leaks just a small string if I got it right).
FIXED: This issue will be fixed with numpy 1.7.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With