
Static memory in python: do loops create new instances of variables in memory?

Tags: python, memory

I've been running Python scripts that make several calls to some functions, say F1(x) and F2(x), that look a bit like this:

x = LoadData()

for j in range(N):
    y = F1(x[j])
    z[j] = F2(y)

    del y

SaveData(z)

Performance is much better if I keep the "del y" line, but I don't understand why. If I don't use "del y", I quickly run out of RAM and have to resort to virtual memory, and everything slows to a crawl. But if I use "del y", I am repeatedly freeing and re-allocating the memory for y. What I would like is for y to sit in static memory and have that memory reused on every F1(x) call. From what I can tell, that isn't what's happening.

Also, not sure if it's relevant, but my data consists of numpy arrays.

marshall.ward asked Jul 22 '10 04:07




2 Answers

Without the del y you might need twice as much memory. On each pass through the loop, y stays bound to the previous result of F1 while the next one is being calculated.

Once F1 returns, y is rebound to the new value and the old F1 result can be released.

If this is the cause, the object returned by F1 must occupy quite a lot of memory.

Unrolling the loop for the first couple of iterations would look like this:

y = F1(x[0])   # F1(x[0]) is calculated, then y is bound to it
z[0] = F2(y)
y = F1(x[1])   # y is still bound to F1(x[0]) while F1(x[1]) is computed
               # The memory for F1(x[0]) is finally freed when y is rebound
z[1] = F2(y)

Using del y is a good solution if this is what is happening in your case.
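To make the lifetime argument above concrete, here is a minimal sketch that counts how many earlier F1 results are still alive while the next call is running. The Big class, the live set, and run() are hypothetical stand-ins, not part of the original code, and the immediate freeing it relies on is CPython's reference counting; other interpreters may release objects later.

```python
import weakref


class Big:
    """Hypothetical stand-in for a large object returned by F1."""

    def __init__(self, tag):
        self.tag = tag


# Weak references do not keep objects alive, so this set only
# contains Big instances that something else still refers to.
live = weakref.WeakSet()


def F1(tag, log):
    # Record how many earlier results still exist while this call runs.
    log.append(len(live))
    result = Big(tag)
    live.add(result)
    return result


def run(n, use_del):
    """Run the loop n times and return the per-call counts of live results."""
    log = []
    for j in range(n):
        y = F1(j, log)
        if use_del:
            del y  # drop the only reference before the next F1 call
    return log


without_del = run(3, use_del=False)
with_del = run(3, use_del=True)
print(without_del)  # from the second call on, one old result is still alive
print(with_del)     # no old result survives into the next call
```

With the del, the previous result is gone before F1 starts building the next one, so at most one large object exists at a time.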

John La Rooy answered Sep 30 '22 13:09


What you actually want is something that's awkward to do in Python: you want to allocate a region of memory for y and pass a pointer to that region to F1() so it can build the next value of y there. This avoids having F1() do its own allocation for the new value of y, the reference to which is then written into your variable y (which holds not the value F1() calculated but a reference to it).

There's already an SO question about passing by reference in Python: How do I pass a variable by reference?
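A rough sketch of that idea, using only the standard library: the buffer is allocated once and a hypothetical in-place variant of F1 (called F1_into here, with a placeholder computation) overwrites it on every pass. The names and the doubling step are assumptions for illustration only. With NumPy arrays, the closest equivalent is the out= argument that ufuncs accept, e.g. np.multiply(a, 2.0, out=y).

```python
from array import array


def F1_into(src, out):
    """Hypothetical in-place variant of F1: writes its result into `out`."""
    for i, value in enumerate(src):
        out[i] = value * 2.0  # placeholder computation
    return out


# Small stand-in for the loaded data: two rows of three doubles each.
x = [array("d", [1.0, 2.0, 3.0]), array("d", [4.0, 5.0, 6.0])]

y = array("d", [0.0] * 3)  # allocated once, outside the loop
buffer_id = id(y)

z = []
for row in x:
    F1_into(row, y)    # overwrites y's contents; no new array is created
    z.append(sum(y))   # stand-in for F2

same_buffer = id(y) == buffer_id
print(z, same_buffer)
```

Because the mutable buffer object itself is passed in, the function can modify it without any pass-by-reference tricks; only rebinding the name inside the function would be invisible to the caller.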

Igor Serebryany answered Sep 30 '22 13:09