
Baffling non-object python memory leak

I have a Django web server running under uWSGI that appears to leak memory.

Specifically, the RSS of the processes slowly grows until eventually I have to restart them.

I am aware of other similar questions, however none of the solutions/conclusions I have found so far appear to apply in this case.

So far, I have used meliae, Heapy, pympler and objgraph to inspect the Python heap, and they all report the same thing: a normal-looking heap using about 40MB of memory (expected) with very little variance over time (as desired).

This is unfortunately entirely inconsistent with the process RSS, which will happily grow to 400MB+ with no reflection in the python heap size.
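For anyone wanting to reproduce this kind of comparison without installing pympler, here is a rough stdlib-only sketch (Unix-only, and the object sum is an underestimate since `sys.getsizeof` misses C-level buffers; note `ru_maxrss` is reported in kB on Linux but bytes on macOS):

```python
import gc
import resource
import sys

def heap_vs_rss_kb():
    """Crudely compare memory tracked as Python objects with process RSS.

    Sums sys.getsizeof over all gc-tracked objects (misses untracked
    objects and memory held privately by C extensions) and reads the
    peak RSS via getrusage (kB on Linux).
    """
    tracked = 0
    for obj in gc.get_objects():
        try:
            tracked += sys.getsizeof(obj)
        except TypeError:
            pass  # a few exotic types reject getsizeof
    rss_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return tracked // 1024, rss_kb

heap_kb, rss_kb = heap_vs_rss_kb()
print("tracked python objects: %d kB, peak RSS: %d kB" % (heap_kb, rss_kb))
```

A large and growing gap between the two numbers is exactly the symptom described here: memory that the process owns but that no Python object accounts for.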

Some sample output to illustrate my point:

Pympler output comparing Python heap/object memory vs process RSS:

Memory snapshot:
                                        types |   # objects |   total size
============================================= | =========== | ============
                                         dict |       20868 |     19852512
                                          str |      118598 |     11735239
                                      unicode |       19038 |     10200248
                                        tuple |       58718 |      5032528
                                         type |        1903 |      1720312
                                         code |       13225 |      1587000
                                         list |       11393 |      1289704
                            datetime.datetime |        6953 |       333744
                                          int |       12615 |       302760
  <class 'django.utils.safestring.SafeUnicode |          18 |       258844
                                      weakref |        2908 |       255904
     <class 'django.db.models.base.ModelState |        3172 |       203008
                   builtin_function_or_method |        2612 |       188064
                       function (__wrapper__) |        1469 |       176280
                                         cell |        2997 |       167832
                            getset_descriptor |        2106 |       151632
                           wrapper_descriptor |        1831 |       146480
                                          set |         226 |       143056
                                      StgDict |         217 |       138328
---------------------------
Total object memory: 56189 kB
Total process usage:
 - Peak virtual memory size: 549016 kB
 - Virtual memory size: 549012 kB
 - Locked memory size: 0 kB
 - Peak resident set size: 258876 kB
 - Resident set size: 258868 kB
 - Size of data segment: 243124 kB
 - Size of stack segment: 324 kB
 - Size of code segment: 396 kB
 - Shared library code size: 57576 kB
 - Page table entries size: 1028 kB
---------------------------

Heapy output showing a similar picture:

Memory snapshot:
Partition of a set of 289509 objects. Total size = 44189136 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0 128384  44 12557528  28  12557528  28 str
     1  61545  21  5238528  12  17796056  40 tuple
     2   5947   2  3455896   8  21251952  48 unicode
     3   3618   1  3033264   7  24285216  55 dict (no owner)
     4    990   0  2570448   6  26855664  61 dict of module
     5   2165   1  1951496   4  28807160  65 type
     6  16067   6  1928040   4  30735200  70 function
     7   2163   1  1764168   4  32499368  74 dict of type
     8  14290   5  1714800   4  34214168  77 types.CodeType
     9  10294   4  1542960   3  35757128  81 list
<1046 more rows. Type e.g. '_.more' to view.>
---------------------------
Total process usage:
 - Peak virtual memory size: 503132 kB
 - Virtual memory size: 503128 kB
 - Locked memory size: 0 kB
 - Peak resident set size: 208580 kB
 - Resident set size: 208576 kB
 - Size of data segment: 192668 kB
 - Size of stack segment: 324 kB
 - Size of code segment: 396 kB
 - Shared library code size: 57740 kB
 - Page table entries size: 940 kB
---------------------------

Note that in both cases, the reported heap size is 40-50MB, whilst the process RSS is 200MB+.

I have also used objgraph's get_leaking_objects() to attempt to see if a C-extension is doing bad ref-counting, however the number of non-gc'able objects does not grow notably over time.
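As a sanity check alongside objgraph, the stdlib `gc` module alone can confirm whether cycle garbage is accumulating. A minimal sketch (note the limitation: objects leaked at the C level via raw `malloc` or bad refcounting will never appear in this view, which is why a flat `get_leaking_objects()` count doesn't fully exonerate extensions either):

```python
import gc

class Node(object):
    """Two Nodes referencing each other form a reference cycle."""
    def __init__(self):
        self.ref = None

def make_cycle():
    a, b = Node(), Node()
    a.ref, b.ref = b, a  # cycle becomes unreachable when we return

gc.collect()                # start from a clean slate
make_cycle()
unreachable = gc.collect()  # count of objects only the cycle collector could free
print(unreachable)
```

If that count (or `len(gc.garbage)` when `gc.DEBUG_SAVEALL` is set) grows steadily between requests, the leak is cyclic Python garbage; if it stays flat while RSS climbs, the leak is below the Python object layer.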

Does anyone have any insight as to how to go about debugging this? At this point, I am presuming one of two things is the case:

  • I have a C-extension leaking memory internally
  • uwsgi itself is leaking memory (though I can find no other evidence of this on the net)

It may be worth mentioning that I've had no success replicating this in any sort of dev environment (though it's possible I'm just not throwing enough traffic at it).

We do use a bunch of modules that have C-extensions (simplejson, hiredis, etc) so it's definitely believable that they could be the cause.

Looking for approaches to take to track this down.
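One more angle for later readers (this didn't exist when the question was asked, being Python 3.4+): `tracemalloc` hooks the Python-level allocators, so if RSS grows while tracemalloc's totals stay flat, the leak is almost certainly below the Python allocator, i.e. a C extension calling `malloc` directly, or glibc-level fragmentation. A minimal sketch:

```python
import tracemalloc

tracemalloc.start()

# Simulate a workload that allocates through the Python allocator.
data = [bytes(1000) for _ in range(1000)]

snapshot = tracemalloc.take_snapshot()
stats = snapshot.statistics("lineno")
for stat in stats[:3]:
    print(stat)  # file:line, total size, allocation count
```

Comparing two snapshots with `snapshot.compare_to(old_snapshot, "lineno")` between batches of requests will point at the allocation sites that grow, when the growth is on the Python side at all.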

asked Jan 09 '13 by fenn




1 Answer

What version of Python are you using? In Python 2.4 and earlier, memory was never returned to the OS by the Python memory allocator (this changed in 2.5).

Even on newer versions you can see this problem. It is either caused by Python's memory allocator, which keeps free lists for simple types that are never handed back, or, if you are running on Linux, by how glibc's malloc implementation allocates and releases memory from the OS. Take a look at http://effbot.org/pyfaq/why-doesnt-python-release-the-memory-when-i-delete-a-large-object.htm and http://pushingtheweb.com/2010/06/python-and-tcmalloc/.
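If glibc's malloc turns out to be the culprit, one experiment is to ask it to hand free heap pages back to the kernel with `malloc_trim`. A hedged sketch via ctypes (glibc/Linux only; on other platforms the function simply isn't there and this returns None):

```python
import ctypes
import ctypes.util

def trim_heap():
    """Ask glibc to release free heap memory back to the OS.

    Returns glibc's malloc_trim result (1 if memory was released,
    0 if not), or None where malloc_trim is unavailable.
    """
    name = ctypes.util.find_library("c")
    if not name:
        return None
    libc = ctypes.CDLL(name)
    try:
        return libc.malloc_trim(0)
    except AttributeError:
        return None  # non-glibc libc

print(trim_heap())
```

If RSS drops noticeably after a trim, the growth was allocator fragmentation rather than a true leak. As a pragmatic mitigation under uWSGI, periodically recycling workers (e.g. the `--max-requests` or `--reload-on-rss` options) also caps the damage regardless of the root cause.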

answered Oct 22 '22 by Bernhard