 

Python hangs for hours on end of functions after creating huge object

Tags:

python

memory

I have a function that generates a huge object (about 100-150 GB of memory, on a machine with 500 GB of memory).

The function runs in about 1h, and writes a file to disk (about 100Mb).

But when the function ends, the program hangs for several hours without doing anything (it doesn't continue with the instructions after the point where the function was called).

I suspect the garbage collector is trying to delete the huge object created in this function, but I don't see anything happening (strace prints nothing), and memory usage is not decreasing.

Do you have any idea why this is happening and how to solve it? I'm using Python 3.5.

asked Jan 25 '18 by cdancette

People also ask

How do I increase the memory limit in Python?

Python doesn't limit your program's memory usage. It will allocate as much memory as your program needs until the machine runs out of memory. The most you can do is impose a fixed upper cap. That can be done with the resource module, but it isn't what you're looking for.
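For illustration, capping a process's address space with the resource module might be sketched like this (Unix-only; the 4 GiB cap is an arbitrary example value, not a recommendation):

```python
import resource

# Query the current address-space limits (soft, hard), in bytes.
soft, hard = resource.getrlimit(resource.RLIMIT_AS)

# Example cap of 4 GiB; never raise the soft limit above the hard limit.
cap = 4 * 1024 ** 3
if hard != resource.RLIM_INFINITY:
    cap = min(cap, hard)

# Allocations beyond the soft limit now raise MemoryError
# instead of exhausting the whole machine.
resource.setrlimit(resource.RLIMIT_AS, (cap, hard))

print(resource.getrlimit(resource.RLIMIT_AS)[0])
```

Lowering the cap cannot be undone by an unprivileged process once the hard limit is reduced, so it is usually applied only to child or worker processes.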

How do you release memory in Python?

As a result, one may have to explicitly free up memory in Python. One way to do this is to force the Python garbage collector to release unused memory using the gc module: one simply needs to run gc.collect() to do so.
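A minimal sketch of that: drop the last reference to a throwaway structure, then ask the cyclic collector to reclaim what it can. Note that plain `del` already frees reference-counted objects immediately; gc.collect() mainly matters for reference cycles.

```python
import gc

# Build a throwaway structure and drop the last reference to it.
data = [[i] for i in range(100_000)]
del data

# Force a full collection; returns the number of unreachable objects found.
unreachable = gc.collect()
print(unreachable)
```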

Does Python automatically free memory?

In many languages, the programmer has to manually allocate memory before it can be used by the program and release it when the program no longer needs it. In Python, memory management is automatic! Python handles the allocation and deallocation of memory itself.
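In CPython the automatic part is largely reference counting: an object is freed as soon as its last reference disappears. A small demonstration with sys.getrefcount:

```python
import sys

obj = []
# getrefcount reports one extra reference (its own argument).
before = sys.getrefcount(obj)

alias = obj            # a second reference to the same list
after = sys.getrefcount(obj)

print(after - before)  # -> 1: exactly one new reference was added
del alias              # dropping it lets the count fall back
```

When the count reaches zero, CPython reclaims the object immediately, without waiting for a garbage-collection pass.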


1 Answer

Certainly not a definitive answer, but here is a thread from the Python Developers mailing list that describes behavior which sounds like what you are experiencing (I have experienced it too): https://mail.python.org/pipermail/python-dev/2008-December/084450.html

Having dug through the thread a bit, I found a few interesting points:

  • Many blame this on swap being slow, but the OP (of the thread) and my own experience show that this is not the case.
  • Others blame it on garbage collection, which I think is part of the culprit. There seems to be an implementation detail involved in freeing many non-contiguous blocks of memory.
    • One example in the thread: garbage collecting a sorted list takes almost no time (1-2 seconds), but when that same list is shuffled, it takes an exorbitant amount of time.
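If the cyclic collector is part of the culprit, one mitigation often suggested in similar discussions (a sketch, not a confirmed fix for the question above) is to disable it while building the huge structure. Reference counting still frees objects normally; only cycle detection is paused:

```python
import gc

gc.disable()  # pause cyclic collection while building a huge structure
try:
    # Small stand-in for the question's 100+ GB object.
    huge = {i: [i] for i in range(100_000)}
finally:
    gc.enable()  # always restore the collector afterwards

print(gc.isenabled())  # -> True
```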

One possible workaround, presented in this message (very near the end of the thread), is to delete the dictionary while keeping references to the objects it contains: https://mail.python.org/pipermail/python-dev/2008-December/084560.html
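A rough sketch of that workaround (names hypothetical, assuming the huge object is a dictionary of values): copy the values into a flat list first, so deleting the dictionary only tears down the dict's own table at that point, while the values stay alive.

```python
# Hypothetical stand-in for the huge object from the question.
huge_dict = {i: object() for i in range(100_000)}

# Keep plain references to the values before dropping the dict.
survivors = list(huge_dict.values())

# Deleting the dictionary now only destroys the dict itself;
# the contained objects remain reachable via `survivors`.
del huge_dict

print(len(survivors))  # -> 100000
```

The deallocation cost of the values themselves is only deferred, not avoided, but splitting the work this way was reported in the thread to change the pathological behavior.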

Unfortunately, from the thread I haven't been able to see a clear solution to it, but hopefully this helps shed some light on what is going on!

answered Sep 27 '22 by colelemonz