Force garbage collection in Python to free memory

I have a Python 2.7 app which uses lots of dict objects, mostly containing strings for keys and values.

Sometimes those dicts and strings are no longer needed and I would like to remove them from memory.

I tried different things: del dict[key], del dict, etc. But the app still uses the same amount of memory.

Below is an example which I would expect to free the memory. But it doesn't :(

import gc
import resource

def mem():
    # NOTE: ru_maxrss is the *peak* resident set size, so this number can
    # never go down within a process. It is reported in kilobytes on Linux
    # and in bytes on macOS; the double division below assumes bytes.
    print('Memory usage         : % 2.2f MB' % round(
        resource.getrusage(resource.RUSAGE_SELF).ru_maxrss/1024.0/1024.0,1)
    )

mem()

print('...creating list of dicts...')
n = 10000
l = []
for i in xrange(n):
    a = 1000*'a'
    b = 1000*'b'
    l.append({ 'a' : a, 'b' : b })

mem()

print('...deleting list items...')

for i in xrange(n):
    l.pop(0)

mem()

print('GC collected objects : %d' % gc.collect())

mem()

Output:

Memory usage         :  4.30 MB
...creating list of dicts...
Memory usage         :  36.70 MB
...deleting list items...
Memory usage         :  36.70 MB
GC collected objects : 0
Memory usage         :  36.70 MB

I would expect some objects to be collected here and some memory to be freed.

Am I doing something wrong? Are there other ways to delete unused objects, or at least to find out where the objects are unexpectedly referenced?

asked Aug 23 '15 by ddofborg

2 Answers

Fredrik Lundh explains:

If you create a large object and delete it again, Python has probably released the memory, but the memory allocators involved don’t necessarily return the memory to the operating system, so it may look as if the Python process uses a lot more virtual memory than it actually uses.

and Alex Martelli writes:

The only really reliable way to ensure that a large but temporary use of memory DOES return all resources to the system when it's done, is to have that use happen in a subprocess, which does the memory-hungry work then terminates.
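
The allocator behavior Lundh describes is easy to see in isolation. A minimal sketch (added for illustration, not part of the original answer): the peak RSS never drops after a del, but a second allocation of the same size barely raises it, because Python reuses the memory it already holds:

import resource

def peak_mb():
    # peak RSS: ru_maxrss is in kilobytes on Linux, bytes on macOS
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024.0

l = [1000 * 'a' for i in xrange(10**5)]
print('peak after first allocation : %.1f MB' % peak_mb())
del l
# nothing is handed back to the OS here, but Python reuses the freed
# blocks, so an equally large allocation barely raises the peak:
l = [1000 * 'b' for i in xrange(10**5)]
print('peak after reallocation     : %.1f MB' % peak_mb())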

So, you could use multiprocessing to spawn a subprocess, perform the memory-hogging calculation, and then ensure the memory is released when the subprocess terminates:

import multiprocessing as mp
import resource

def mem():
    # here ru_maxrss is assumed to be in kilobytes (Linux),
    # so a single division yields megabytes
    print('Memory usage         : % 2.2f MB' % round(
        resource.getrusage(resource.RUSAGE_SELF).ru_maxrss/1024.0,1)
    )

mem()

def memoryhog():
    print('...creating list of dicts...')
    n = 10**5
    l = []
    for i in xrange(n):
        a = 1000*'a'
        b = 1000*'b'
        l.append({ 'a' : a, 'b' : b })
    mem()

proc = mp.Process(target=memoryhog)
proc.start()
proc.join()   # when the subprocess exits, its memory is returned to the OS

mem()

yields

Memory usage         :  5.80 MB
...creating list of dicts...
Memory usage         :  234.20 MB
Memory usage         :  5.90 MB
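
Note that memoryhog above throws its result away. If the parent needs the computed value back, one option (a sketch of mine, not part of the original answer) is a pool worker that exits after a single task, so the small result is pickled back to the parent while the worker's memory still goes back to the OS:

import multiprocessing as mp

def memoryhog():
    l = [{'a': 1000*'a', 'b': 1000*'b'} for i in xrange(10**5)]
    return len(l)   # only this small value is pickled back to the parent

if __name__ == '__main__':
    # maxtasksperchild=1 makes the worker exit after one task,
    # returning its memory to the operating system
    pool = mp.Pool(processes=1, maxtasksperchild=1)
    result = pool.apply(memoryhog)
    pool.close()
    pool.join()
    print(result)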
answered Oct 13 '22 by unutbu


Multiprocessing might be somewhat useful here combined with a library called Ray, which uses shared memory to share multi-GB data between processes. This way it is easy to spawn a secondary process and still access the same objects quickly and easily from the parent process.
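
A minimal sketch of that idea, assuming a recent Ray version (which requires Python 3; build_dicts is a made-up example function, not part of this answer):

import ray

ray.init()  # starts worker processes plus a shared-memory object store

@ray.remote
def build_dicts(n):
    # runs in a separate worker process
    return [{'a': 1000*'a', 'b': 1000*'b'} for i in range(n)]

# the large result lives in the shared-memory object store;
# ray.get() materializes it in the parent only when asked for
ref = build_dicts.remote(10**4)
print(len(ray.get(ref)))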

answered Oct 13 '22 by Corneliu Maftuleac