 

Forcing memory to release after running a function

I use a module (that I cannot modify) which contains a method that I need to use. This method returns 10GB of data, but also allocates 8GB of memory that it does not release. I need to use this method at the start of a script that runs for a long time, and I want to make sure the 8GB of memory are released after I run the method. What are my options here?

To be clear, the 8GB do not get reused by the script - i.e. if I create a large numpy array after running the method, extra memory is allocated for that numpy array.

I have considered running the method in a separate process using the multiprocessing module (and returning the result), but ran into problems serializing the method's large result: 10GB cannot be pickled by the default pickler, and even if I force multiprocessing to use pickle protocol 4, pickling has a very large memory overhead. Is there anything else I could do without being able to modify the offending module?

Edit: here is an example

from dataloader import dataloader1
result = dataloader1.get("DATA1")

As I understand it, dataloader is a Python wrapper around some C++ code using pybind11. I do not know much more about its internal workings. The code above results in 18GB being used. If I then run

del result

10GB gets freed up correctly, but 8GB continues being used (with seemingly no Python objects existing any more).

Edit2: If I create a smallish numpy array (e.g. 3GB), memory usage stays at 8GB. If I delete it and instead create a 6GB numpy array, memory usage goes to 14GB and comes back down to 8GB after I delete it. I still need the 8GB released to the OS.
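The behaviour in Edit 2 suggests the 8GB has been freed at the allocator level but not returned to the OS. On Linux with glibc, you can ask the allocator to hand freed heap pages back with malloc_trim. A hedged sketch, assuming glibc is the active allocator (it does nothing useful under other allocators, or for memory the C++ extension manages itself):

```python
import ctypes
import ctypes.util
import gc
import sys

def release_memory_to_os():
    """Run the garbage collector, then ask glibc to return freed heap
    pages to the OS. Only effective on Linux with glibc, and only for
    memory that glibc's malloc can actually trim."""
    gc.collect()
    if sys.platform.startswith("linux"):
        libc_name = ctypes.util.find_library("c")
        if libc_name:
            libc = ctypes.CDLL(libc_name)
            # malloc_trim may be absent on non-glibc libcs (e.g. musl).
            trim = getattr(libc, "malloc_trim", None)
            if trim is not None:
                # Returns 1 if some memory was released, 0 otherwise.
                return trim(0)
    return None

trimmed = release_memory_to_os()
```

This cannot recover memory that the extension holds in live allocations, but it often shrinks resident set size when freed memory is merely being cached by the allocator, as the Edit 2 observation hints.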

asked Sep 15 '25 by rinspy


1 Answer

Can you modify the function? If the memory is held by some module, try reloading that module with importlib.reload, which should release the memory.
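For reference, a reload looks like this; json is used here as a stand-in for the dataloader module from the question. Note one caveat: CPython does not fully re-initialise C extension modules on reload, so whether this actually frees memory held at the C++/pybind11 level depends on the extension.

```python
import importlib
import json  # stand-in for the dataloader module from the question

# Drop all references to the data first, so the reload has a chance
# to release module-level caches along with the module state.
reloaded = importlib.reload(json)
```

importlib.reload re-executes the module's code in place and returns the same module object, so existing "import dataloader" references stay valid.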

answered Sep 18 '25 by Christian Sauer