I'm reading in a collection of objects (tables like sqlite3 tables or dataframes) from an object-oriented database, and most of them are small enough that the Python garbage collector handles them without incident. However, when they get larger (though still under 10 MB), the GC doesn't seem to be able to keep up.
The pseudocode looks like this:
import gc

walk = walkgenerator('/path')
objs = objgenerator(walk)

with db.transaction(bundle=True, maxSize=10000, maxParts=10):
    oldobj = None
    oldtable = None
    for count, obj in enumerate(objs):
        currenttable = obj.table
        # The previous object is obsolete once a related table shows up,
        # so drop it from the database.
        if oldtable and oldtable in currenttable:
            db.delete(oldobj.path)
        del oldtable
        oldtable = currenttable
        del oldobj
        oldobj = obj
        # Nudge the cyclic collector every 100 objects.
        if not count % 100:
            gc.collect()
I'm looking for an elegant way to manage memory while allowing Python to handle it when possible.
Perhaps embarrassingly, I've tried using del to help clean up reference counts.
I've tried calling gc.collect() at varying modulo counts in my for loops.
Suggestions are appreciated, particularly tools to assist with introspection. I've used Windows Task Manager here, and the process seems to more or less randomly spring a memory leak. I've limited the transaction size as much as I feel comfortable, and that seems to help a little bit.
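For context, the kind of introspection I mean would look something like this sketch using the standard-library tracemalloc module (the placement of the loop is hypothetical):

import tracemalloc

tracemalloc.start()

# ... run the object-processing loop here ...

snapshot = tracemalloc.take_snapshot()
# Show the ten source lines responsible for the most allocated memory.
for stat in snapshot.statistics('lineno')[:10]:
    print(stat)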
Python objects are stored in heap memory, which supports dynamic allocation: the number and size of objects can change at run time. The interpreter allocates and deallocates this memory automatically, work that C/C++ programmers must do by hand. The process by which Python periodically frees and reclaims blocks of memory that are no longer in use is called garbage collection. Python uses two strategies to delete unused objects: reference counting, which frees an object the moment its reference count drops to zero, and a cyclic garbage collector, which runs periodically during program execution to reclaim objects trapped in reference cycles that reference counting alone cannot free.
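A minimal sketch of the difference between the two strategies:

import gc

x = [1] * 1000       # managed by reference counting alone
del x                # refcount hits zero: freed immediately, no collector needed

a = []
a.append(a)          # 'a' refers to itself: a reference cycle
del a                # refcount never reaches zero on its own...
print(gc.collect())  # ...so the cyclic collector reclaims it (prints >= 1)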
There's not enough info here to say much, but what I do have to say wouldn't fit in a comment so I'll post it here ;-)
First, and most importantly, in CPython garbage collection is mostly based on reference counting. gc.collect() won't do anything for you (except burn time) unless trash objects are involved in reference cycles (an object A can be reached from itself by following a chain of pointers transitively reachable from A). You create no reference cycles in the code you showed, but perhaps the database layer does.

So, after you run gc.collect(), does memory use go down at all? If not, running it is pointless.
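One cheap way to check: gc.collect() returns the number of unreachable objects it found, which you can log alongside a process-memory reading. A sketch, assuming the third-party psutil package is available:

import gc
import psutil  # third-party: pip install psutil

proc = psutil.Process()
before = proc.memory_info().rss          # resident set size, in bytes
found = gc.collect()                     # count of unreachable objects found
after = proc.memory_info().rss
print(f"cyclic gc found {found} unreachable objects; "
      f"rss changed by {after - before:+d} bytes")

If "found" is consistently 0 and rss never drops, the collector isn't what's holding your memory.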
I expect it's most likely that the database layer is holding references to objects longer than necessary, but digging into that requires digging into exact details of how the database layer is implemented.
One way to get clues is to print the result of sys.getrefcount() applied to various large objects:
>>> import sys
>>> bigobj = [1] * 1000000
>>> sys.getrefcount(bigobj)
2
As the docs say, the result is generally 1 larger than you might hope, because the refcount of getrefcount()'s argument is temporarily incremented by 1 simply because it is being used (temporarily) as an argument. So if you see a refcount greater than 2, del won't free the object.
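For example, with a second (hypothetical) lingering reference:

>>> import sys
>>> bigobj = [1] * 1000000
>>> alias = bigobj            # a second reference, e.g. held by a cache
>>> sys.getrefcount(bigobj)   # bigobj + alias + the call's own argument
3
>>> del bigobj                # the list is NOT freed; 'alias' still holds it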
Another way to get clues is to pass the object to gc.get_referrers(). That returns a list of objects that directly refer to the argument (provided that a referrer participates in Python's cyclic gc).
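A quick illustration (the cache dict is made up; in an interactive session the other referrer is the session's own globals dict):

>>> import gc
>>> bigobj = [1] * 1000000
>>> cache = {'latest': bigobj}   # hypothetical container keeping it alive
>>> for referrer in gc.get_referrers(bigobj):
...     print(type(referrer))
...
<class 'dict'>
<class 'dict'>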
BTW, you need to be clearer about what you mean by "doesn't seem to work" and "blows up eventually". Can't guess. What exactly goes wrong? For example, is MemoryError raised? Something else? Tracebacks often yield a world of useful clues.