Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Ways to free memory back to OS from Python?

I have code that looks similar to this:

def memoryIntensiveFunction(x):
    largeTempVariable = Intermediate(x)
    processFunction(largeTempVariable,x)

The problem is that the variable temp is something like 500 mb in a test case of mine, but that space is not returned to the OS when memoryIntensiveFunction is finished. I know this because memory profiling with the guppy tool says largeTempVariable is freed (i.e., within Python), but psutil shows it isn't. I presume I'm seeing the effects described here. The problem is that this process is long running (i.e. hours), memoryIntensiveFunction is run at the beginning and never again, so it's inconvenient for me to have to carry the 500mb around for hours.

One solution I found here and here suggests using a separate process. Multiprocessing incurs its own costs, but it would be worth it in my case. However, this would require refactoring memoryIntensiveFunction callers to receive x as a return value instead of seeing it modified in place. The real killer is that my object x is not picklable (it makes heavy use of boost python extensions). It would be a lot of work to make x picklable.

Are there any options I'm not considering?

like image 886
amos Avatar asked Nov 01 '22 21:11

amos


1 Answers

This seem curious enough that I tried to reproduce your issue, and seems that simple "del" was plenty. To demonstrate, you can run the following code:

import itertools
import pdb

def test():
    a = "a"
    for _ in itertools.repeat(None, 30):
        a += a
    pdb.set_trace()
    del a
    pdb.set_trace()

test()

And at first breakpoint you will see that it uses roughly 1gb of ram (you want the python3.3 entry):

 Private  +   Shared  =  RAM used       Program

  4.0 KiB +   9.0 KiB =  13.0 KiB       VisualGDB-DisownTTY-r1
  4.0 KiB +  15.0 KiB =  19.0 KiB       sharing-tests
  4.0 KiB +  19.5 KiB =  23.5 KiB       dhcpcd
  4.0 KiB +  31.5 KiB =  35.5 KiB       gdb
  4.0 KiB +  36.0 KiB =  40.0 KiB       vim [deleted]
  4.0 KiB +  38.0 KiB =  42.0 KiB       systemd-udevd
 40.0 KiB +  10.0 KiB =  50.0 KiB       init
 24.0 KiB + 135.0 KiB = 159.0 KiB       agetty (6)
 12.0 KiB + 150.0 KiB = 162.0 KiB       su (3)
 88.0 KiB + 103.0 KiB = 191.0 KiB       syslog-ng (2)
152.0 KiB +  55.0 KiB = 207.0 KiB       crond
172.0 KiB +  81.0 KiB = 253.0 KiB       python3.4
580.0 KiB + 220.5 KiB = 800.5 KiB       sshd (3)
768.0 KiB + 932.0 KiB =   1.7 MiB       bash (13)
  2.8 MiB + 118.0 KiB =   2.9 MiB       mongod
  7.4 MiB + 109.0 KiB =   7.5 MiB       tmux [deleted] (2)
  1.0 GiB +   1.2 MiB =   1.0 GiB       python3.3
---------------------------------
                          1.0 GiB
=================================

And then at second breakpoint, after we del the variable the memory is freed:

 Private  +   Shared  =  RAM used       Program

  4.0 KiB +   9.0 KiB =  13.0 KiB       VisualGDB-DisownTTY-r1
  4.0 KiB +  15.0 KiB =  19.0 KiB       sharing-tests
  4.0 KiB +  19.5 KiB =  23.5 KiB       dhcpcd
  4.0 KiB +  31.5 KiB =  35.5 KiB       gdb
  4.0 KiB +  36.0 KiB =  40.0 KiB       vim [deleted]
  4.0 KiB +  38.0 KiB =  42.0 KiB       systemd-udevd
 40.0 KiB +  10.0 KiB =  50.0 KiB       init
 24.0 KiB + 135.0 KiB = 159.0 KiB       agetty (6)
 12.0 KiB + 150.0 KiB = 162.0 KiB       su (3)
 88.0 KiB + 103.0 KiB = 191.0 KiB       syslog-ng (2)
152.0 KiB +  55.0 KiB = 207.0 KiB       crond
172.0 KiB +  81.0 KiB = 253.0 KiB       python3.4
584.0 KiB + 220.5 KiB = 804.5 KiB       sshd (3)
768.0 KiB + 928.0 KiB =   1.7 MiB       bash (13)
  2.8 MiB + 118.0 KiB =   2.9 MiB       mongod
  5.1 MiB +   1.2 MiB =   6.3 MiB       python3.3
  7.4 MiB + 109.0 KiB =   7.5 MiB       tmux [deleted] (2)
---------------------------------
                         20.3 MiB
=================================

Now if we drop the "del" from function, and set a breakpoint right after test():

import itertools
import pdb

def test():
    a = "a"
    for _ in itertools.repeat(None, 30):
        a += a
    pdb.set_trace()

test()
pdb.set_trace()

The memory indeed won't be freed before we terminate:

 Private  +   Shared  =  RAM used       Program

  4.0 KiB +   9.0 KiB =  13.0 KiB       VisualGDB-DisownTTY-r1
  4.0 KiB +  15.0 KiB =  19.0 KiB       sharing-tests
  4.0 KiB +  19.5 KiB =  23.5 KiB       dhcpcd
  4.0 KiB +  31.5 KiB =  35.5 KiB       gdb
  4.0 KiB +  36.0 KiB =  40.0 KiB       vim [deleted]
  4.0 KiB +  38.0 KiB =  42.0 KiB       systemd-udevd
 40.0 KiB +  10.0 KiB =  50.0 KiB       init
 24.0 KiB + 135.0 KiB = 159.0 KiB       agetty (6)
 12.0 KiB + 150.0 KiB = 162.0 KiB       su (3)
160.0 KiB +  53.0 KiB = 213.0 KiB       crond
172.0 KiB +  81.0 KiB = 253.0 KiB       python3.4
628.0 KiB + 219.5 KiB = 847.5 KiB       sshd (3)
836.0 KiB + 152.0 KiB = 988.0 KiB       syslog-ng (2)
752.0 KiB + 957.0 KiB =   1.7 MiB       bash (13)
  2.8 MiB + 113.0 KiB =   2.9 MiB       mongod
  7.4 MiB + 108.0 KiB =   7.6 MiB       tmux [deleted] (2)
  1.0 GiB +   1.1 MiB =   1.0 GiB       python3.3
---------------------------------
                          1.0 GiB
=================================

So my suggestion? Just delete the sucker after you've used it, and do not need it any more ;)

like image 86
Tymoteusz Paul Avatar answered Nov 15 '22 05:11

Tymoteusz Paul