I've been running Python scripts that make several calls to some functions, say F1(x) and F2(x), that look a bit like this:
    x = LoadData()
    for j in range(N):
        y = F1(x[j])
        z[j] = F2(y)
        del y
    SaveData(z)
Performance is a lot faster if I keep the "del y" line, but I don't understand why. If I don't use "del y", then I quickly run out of RAM and have to resort to virtual memory, and everything slows to a crawl. But if I use "del y", then I am repeatedly freeing and re-allocating the memory for y. What I would like to do is have y sit in static memory and reuse that memory on every F1(x) call. But from what I can tell, that isn't what's happening.
Also, not sure if it's relevant, but my data consists of numpy arrays.
Conceptually, Python stores objects in heap memory and references to those objects on the stack. Local variables and function calls live on the stack, while the objects they refer to live on the heap.
Stack allocation: a declaration such as int a = 10; inside a C function is allocated on the stack. The stack holds memory whose size is known in advance and that is needed only during a particular function or method call. A frame for the function is pushed onto the program's call stack whenever it is called.
Variables are usually stored in RAM: in a static data region (global variables usually go there), on the heap (dynamically allocated objects go there), or on the stack (variables declared within a method/function usually go there). Stack and heap are both RAM, just different regions. A pointer variable can itself live on the stack while the data it points to lives on the heap.
Memory management in Python involves a private heap containing all Python objects and data structures. The management of this private heap is ensured internally by the Python memory manager.
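As a small illustration of the point that names hold references to heap objects, here is a minimal sketch. The names a and b are made up for this example, and sys.getrefcount is a CPython-specific helper, so the counts should be read as an implementation detail rather than a language guarantee.

```python
import sys

a = [1, 2, 3]                      # one list object is created on Python's private heap
b = a                              # b is a second name for the same object, not a copy
same_object = b is a               # True: both names refer to one object

refs_before = sys.getrefcount(a)   # count while both a and b refer to the list
del b                              # removes the name b, not the list itself
refs_after = sys.getrefcount(a)    # one reference fewer; the list is still alive via a
```

The object is only released once the last reference to it goes away, which is why "del y" in the question can free memory early.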
Without the del y you might need twice as much memory. This is because for each pass through the loop, y is bound to the previous value of F1 while the next one is calculated. Once F1 returns, y is rebound to that new value and the old F1 result can be released. This would mean that the object returned by F1 occupies quite a lot of memory.
Unrolling the loop for the first couple of iterations would look like this:
    y = F1(x[0])   # F1(x[0]) is calculated, then y is bound to it
    z[0] = F2(y)
    y = F1(x[1])   # y is still bound to F1(x[0]) while F1(x[1]) is computed
                   # The memory for F1(x[0]) is finally freed when y is rebound
    z[1] = F2(y)
Using del y is a good solution if this is what is happening in your case.
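To make the overlap concrete, here is a minimal sketch that records whether the previous result still exists while the next call to F1 runs. The Result class, the run helper and the doubling inside F1 are stand-ins invented for this example, and the immediate release after del relies on CPython's reference counting.

```python
import weakref

class Result:
    """Hypothetical stand-in for a large object returned by F1."""
    def __init__(self, value):
        self.value = value

def run(use_del):
    previous = None          # weak reference to the previous F1 result
    alive_during_call = []   # was the old result still alive inside each F1 call?

    def F1(v):
        # A weak reference does not keep the object alive, so this only
        # reports True if something else (the name y) still refers to it.
        alive_during_call.append(previous is not None and previous() is not None)
        return Result(v * 2)

    for v in range(3):
        y = F1(v)
        previous = weakref.ref(y)
        if use_del:
            del y            # drop the only strong reference before the next call
    return alive_during_call

without_del = run(False)     # [False, True, True]
with_del = run(True)         # [False, False, False]
```

Without del, the old result is still alive during every call after the first, so two results coexist; with del, it is already gone by the time F1 runs again.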
What you actually want is something that's weird to do in Python: you want to allocate a region of memory for y and pass a pointer to that region to F1() so it can use that region to build up the next value of y. This avoids having F1() do its own allocation for the new value of y, the reference to which is then written into your own variable y (which actually holds not the value that F1() calculated but a reference to it).
There's already an SO question about passing by reference in Python: "How do I pass a variable by reference?"
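Since the data here are numpy arrays, one way to approximate the "allocate once and reuse" idea is the out= argument that numpy's ufuncs accept. The sketch below is only an assumption about how F1 could be restructured: F1_into, the arithmetic inside it, the F2 reduction and the sample x are all hypothetical, and it only works if every F1 result has the same shape and dtype.

```python
import numpy as np

def F1_into(x_j, out):
    """Hypothetical in-place variant of F1: writes its result into `out`."""
    np.multiply(x_j, 2.0, out=out)   # result goes into the existing buffer
    np.add(out, 1.0, out=out)        # still no new array allocated for the result
    return out

def F2(y):
    """Hypothetical reduction standing in for the real F2."""
    return float(y.sum())

x = np.arange(12, dtype=np.float64).reshape(3, 4)   # stand-in for LoadData()
z = np.empty(len(x))

y = np.empty_like(x[0])              # buffer allocated once, before the loop
for j in range(len(x)):
    F1_into(x[j], y)                 # overwrites the same buffer every iteration
    z[j] = F2(y)
```

With this structure there is nothing to delete inside the loop, because y refers to the same array throughout.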