
Python monotonically increasing memory usage (leak?)

I'm running the simple code below and seeing memory usage increase monotonically. I use this little module to dump objects to disk. It happens with unicode strings but not with integers. Is there something I'm doing wrong?

When I do:

>>> from utils.diskfifo import DiskFifo
>>> df=DiskFifo()
>>> for i in xrange(1000000000):
...     df.append(i)

Memory consumption is stable

but when I do:

>>> while True:
...     a={'key': u'value', 'key2': u'value2'}
...     df.append(a)

It goes through the roof. Any hints? The module is below:


import tempfile
import cPickle

class DiskFifo:
    def __init__(self):
        self.fd = tempfile.TemporaryFile()
        self.wpos = 0
        self.rpos = 0
        self.pickler = cPickle.Pickler(self.fd)
        self.unpickler = cPickle.Unpickler(self.fd)
        self.size = 0

    def __len__(self):
        return self.size

    def extend(self, sequence):
        # Plain loop: avoids building a throwaway list of None results.
        for item in sequence:
            self.append(item)

    def append(self, x):
        self.fd.seek(self.wpos)
        self.pickler.dump(x)
        self.wpos = self.fd.tell()
        self.size = self.size + 1

    def next(self):
        try:
            self.fd.seek(self.rpos)
            x = self.unpickler.load()
            self.rpos = self.fd.tell()
            return x

        except EOFError:
            raise StopIteration

    def __iter__(self):
        self.rpos = 0
        return self

piotr asked Jul 28 '11 09:07


1 Answer

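The Pickler object stores every object it has seen in its memo, so that it doesn't have to pickle the same thing twice. Because the memo keeps a reference to each of those objects, the dicts you append are never freed and the memo grows with every dump. Integers are not memoized, which is why your first loop stays flat. To avoid this, clear the memo before dumping, so that no references to your objects accumulate in the pickler: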

def append(self, x):
    self.fd.seek(self.wpos)
    self.pickler.clear_memo()
    self.pickler.dump(x)
    self.wpos = self.fd.tell()
    self.size = self.size + 1

Source: http://docs.python.org/library/pickle.html#pickle.Pickler.clear_memo
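
For readers on Python 3, here is a minimal sketch of the same fix in context. It is an assumption-laden port, not the original module: it uses `pickle` in place of `cPickle`, `__next__` in place of `next`, and reads each item back with `pickle.load` instead of a shared `Unpickler`. Reading this way works because `clear_memo()` makes every dumped item self-contained.

```python
import pickle
import tempfile


class DiskFifo:
    """Python 3 sketch of the question's class with the memo cleared per item."""

    def __init__(self):
        self.fd = tempfile.TemporaryFile()
        self.wpos = 0
        self.rpos = 0
        self.pickler = pickle.Pickler(self.fd)
        self.size = 0

    def __len__(self):
        return self.size

    def append(self, x):
        self.fd.seek(self.wpos)
        # Drop references to previously pickled objects before each dump.
        self.pickler.clear_memo()
        self.pickler.dump(x)
        self.wpos = self.fd.tell()
        self.size += 1

    def __iter__(self):
        self.rpos = 0
        return self

    def __next__(self):
        # Stop once the read position catches up with the write position.
        if self.rpos >= self.wpos:
            raise StopIteration
        self.fd.seek(self.rpos)
        # Assumption: a one-shot load per item instead of a shared Unpickler.
        x = pickle.load(self.fd)
        self.rpos = self.fd.tell()
        return x


fifo = DiskFifo()
items = [{'key': 'value', 'key2': 'value2', 'n': i} for i in range(100)]
for item in items:
    fifo.append(item)
restored = list(fifo)
```

Iterating the FIFO should give back the appended dicts in insertion order, and the pickler no longer holds on to them between calls.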

Edit: You can actually watch the size of the memo go up as you pickle your objects by using the following append function:

def append(self, x):
    self.fd.seek(self.wpos)
    print len(self.pickler.memo)
    self.pickler.dump(x)
    self.wpos = self.fd.tell()
    self.size = self.size + 1
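
The same effect can be measured without touching the class. The sketch below is written for Python 3 (an assumption, since the code above is Python 2) and uses an in-memory `io.BytesIO` buffer as a stand-in for the temporary file. The helper `memo_size` is a hypothetical name introduced here; it copies the memo into a plain dict before counting its entries.

```python
import io
import pickle


def memo_size(pickler):
    # Hypothetical helper: copy() yields a plain dict whose length we can take.
    return len(pickler.memo.copy())


buf = io.BytesIO()
pickler = pickle.Pickler(buf)

# Integers are not memoized, so the memo stays empty.
for i in range(1000):
    pickler.dump(i)
ints_memo = memo_size(pickler)

# Every new dict is remembered (and kept alive) by the memo.
for _ in range(1000):
    pickler.dump({'key': 'value', 'key2': 'value2'})
dicts_memo = memo_size(pickler)

# Clearing the memo before each dump keeps it bounded by a single item.
for _ in range(1000):
    pickler.clear_memo()
    pickler.dump({'key': 'value', 'key2': 'value2'})
cleared_memo = memo_size(pickler)

print(ints_memo, dicts_memo, cleared_memo)
```

The three printed numbers should show an empty memo after the integer loop, a memo with at least one entry per dumped dict after the second loop, and only a handful of entries once `clear_memo()` is called before each dump.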

combatdave answered Nov 11 '22 15:11