I have a Django web server running under uWSGI that appears to leak memory.
Specifically, the RSS of the worker processes slowly grows until eventually I have to restart them.
I am aware of other, similar questions, but none of the solutions/conclusions I have found so far appear to apply in this case.
So far I have used meliae, Heapy, pympler and objgraph to inspect the Python heap, and they all report the same thing: a normal-looking heap using about 40MB of memory (expected) with very little variance over time (as desired).
Unfortunately this is entirely inconsistent with the process RSS, which will happily grow to 400MB+ with no corresponding growth in the Python heap.
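For anyone wanting to gather the same kind of snapshot, the general shape is the following (a minimal sketch, not my exact instrumentation; it assumes pympler is installed and a Linux /proc filesystem for the process-level numbers):

# Sketch: a pympler heap summary plus process memory from /proc/self/status
# (the Vm* fields there map onto the "Total process usage" numbers below).
from pympler import muppy, summary

def dump_memory_snapshot():
    print('Memory snapshot:')
    all_objects = muppy.get_objects()          # every Python object the GC can reach
    summary.print_(summary.summarize(all_objects), limit=20)

    print('Total process usage:')
    for line in open('/proc/self/status'):     # kernel's view of the process (Linux only)
        if line.startswith('Vm'):              # VmPeak, VmSize, VmRSS, VmData, ...
            print(' - ' + line.strip())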
Some sample output to illustrate my point -
Pympler output comparing python heap/object memory vs process RSS:
Memory snapshot:
types | # objects | total size
============================================= | =========== | ============
dict | 20868 | 19852512
str | 118598 | 11735239
unicode | 19038 | 10200248
tuple | 58718 | 5032528
type | 1903 | 1720312
code | 13225 | 1587000
list | 11393 | 1289704
datetime.datetime | 6953 | 333744
int | 12615 | 302760
<class 'django.utils.safestring.SafeUnicode | 18 | 258844
weakref | 2908 | 255904
<class 'django.db.models.base.ModelState | 3172 | 203008
builtin_function_or_method | 2612 | 188064
function (__wrapper__) | 1469 | 176280
cell | 2997 | 167832
getset_descriptor | 2106 | 151632
wrapper_descriptor | 1831 | 146480
set | 226 | 143056
StgDict | 217 | 138328
---------------------------
Total object memory: 56189 kB
Total process usage:
- Peak virtual memory size: 549016 kB
- Virtual memory size: 549012 kB
- Locked memory size: 0 kB
- Peak resident set size: 258876 kB
- Resident set size: 258868 kB
- Size of data segment: 243124 kB
- Size of stack segment: 324 kB
- Size of code segment: 396 kB
- Shared library code size: 57576 kB
- Page table entries size: 1028 kB
---------------------------
Heapy output showing a similar thing:
Memory snapshot:
Partition of a set of 289509 objects. Total size = 44189136 bytes.
Index Count % Size % Cumulative % Kind (class / dict of class)
0 128384 44 12557528 28 12557528 28 str
1 61545 21 5238528 12 17796056 40 tuple
2 5947 2 3455896 8 21251952 48 unicode
3 3618 1 3033264 7 24285216 55 dict (no owner)
4 990 0 2570448 6 26855664 61 dict of module
5 2165 1 1951496 4 28807160 65 type
6 16067 6 1928040 4 30735200 70 function
7 2163 1 1764168 4 32499368 74 dict of type
8 14290 5 1714800 4 34214168 77 types.CodeType
9 10294 4 1542960 3 35757128 81 list
<1046 more rows. Type e.g. '_.more' to view.>
---------------------------
Total process usage:
- Peak virtual memory size: 503132 kB
- Virtual memory size: 503128 kB
- Locked memory size: 0 kB
- Peak resident set size: 208580 kB
- Resident set size: 208576 kB
- Size of data segment: 192668 kB
- Size of stack segment: 324 kB
- Size of code segment: 396 kB
- Shared library code size: 57740 kB
- Page table entries size: 940 kB
---------------------------
Note that in both cases, the reported heap size is 40-50MB, whilst the process RSS is 200MB+.
I have also used objgraph's get_leaking_objects() to check whether a C extension is doing bad ref-counting, but the number of non-gc'able objects does not grow notably over time.
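(Roughly, that check looks like the sketch below; get_leaking_objects() is the call mentioned above, while the show_most_common_types() breakdown and the logging wrapper are just one way to inspect the result.)

# Sketch: look for objects that nothing references (a symptom of bad
# ref-counting in a C extension) and log how many there are over time.
import gc
import objgraph

def log_leak_candidates():
    gc.collect()                                 # collect garbage first so only real leaks remain
    leaking = objgraph.get_leaking_objects()     # objects not reachable from any known root
    print('objects invisible to normal traversal: %d' % len(leaking))
    objgraph.show_most_common_types(objects=leaking, limit=10)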
Does anyone have any insight into how to go about debugging this? At this point, I am presuming one of two things is the case:
1. A C extension (or something else outside the Python heap) is leaking memory that the Python heap tools cannot see.
2. The allocator (Python's or glibc's) is fragmenting or holding on to freed memory rather than returning it to the OS.
It may be worth mentioning that I've had no success replicating this in any sort of dev environment (though it's possible I'm just not throwing enough traffic at it).
We do use a bunch of modules that have C extensions (simplejson, hiredis, etc.), so it's definitely believable that they could be the cause.
I'm looking for approaches to take to track this down.
One debugging approach is to use Python's built-in gc (garbage collector) module. gc.get_objects() gives you a list of every object the collector knows about, which lets you see where most of the Python-level memory is going.
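For example, something along these lines (a rough sketch, assuming you can take two snapshots between batches of requests and diff them):

# Sketch: count live objects by type via gc.get_objects() and compare snapshots.
import gc
from collections import Counter

def type_counts():
    return Counter(type(o).__name__ for o in gc.get_objects())

before = type_counts()
# ... exercise the suspect code / serve some requests ...
after = type_counts()
for name, delta in (after - before).most_common(10):
    print('%+6d %s' % (delta, name))

Types whose counts keep climbing between snapshots are the ones worth chasing.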
Memory can also be exhausted by calls that never return: if a method keeps being invoked (for example in an infinite loop or unbounded recursion) and the data it accumulates is never released, the stack keeps growing until the program runs out of memory.
As a result, you may sometimes need to prompt Python to free memory explicitly. One way is to force a garbage collection using the gc module by calling gc.collect().
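For instance (trivial sketch):

import gc

unreachable = gc.collect()   # force a full collection; returns the number of unreachable objects found
print('collected %d unreachable objects' % unreachable)
print('uncollectable objects in gc.garbage: %d' % len(gc.garbage))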
What version of Python are you using? In Python 2.4 and earlier, the Python memory allocator did not return freed memory to the OS.
Even in newer versions you can see a similar problem, caused either by Python's allocator keeping free lists for simple types, or, if you are running on Linux, by how glibc's malloc implementation allocates and releases memory from the OS. Take a look at http://effbot.org/pyfaq/why-doesnt-python-release-the-memory-when-i-delete-a-large-object.htm and http://pushingtheweb.com/2010/06/python-and-tcmalloc/.
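If glibc's malloc turns out to be the problem, one extra thing to try (Linux/glibc only, and my own suggestion rather than something from those articles) is asking glibc to hand freed pages back to the kernel via malloc_trim:

# Sketch (Linux/glibc only): ask malloc to return free arena pages to the kernel.
import ctypes

libc = ctypes.CDLL('libc.so.6')
released = libc.malloc_trim(0)   # glibc call; returns 1 if memory was released back to the OS
print('malloc_trim released memory: %s' % bool(released))

If RSS drops noticeably after calling this, you are looking at fragmentation/allocator behaviour rather than a true leak; if it doesn't, a leaking C extension becomes more likely.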