 

How can I profile the memory of a multithreaded program in Python?

Is there a way to profile the memory of a multithreaded program in Python?

For CPU profiling, I am using cProfile to create separate profiler stats for each thread and later combine them. However, I couldn't find a way to do this with memory profilers. I am using heapy.
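(For reference, a simplified sketch of the per-thread cProfile setup I mean; the profiled wrapper and the file names are just illustrative.)

import cProfile
import pstats
import threading

def profiled(target, stats_path):
    """Run target under its own cProfile.Profile and dump the stats to a file."""
    def runner(*args, **kwargs):
        prof = cProfile.Profile()
        prof.enable()  # profiles only the calling thread
        try:
            target(*args, **kwargs)
        finally:
            prof.disable()
            prof.dump_stats(stats_path)
    return runner

def work():
    sum(i * i for i in range(100000))

threads = [threading.Thread(target=profiled(work, "thread_%d.prof" % i))
           for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Combine the per-thread stats into one report.
stats = pstats.Stats("thread_0.prof")
for i in range(1, 4):
    stats.add("thread_%d.prof" % i)
stats.sort_stats("cumulative").print_stats(10)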

Is there a way to combine stats in heapy the way I do with cProfile? Or is there another memory profiler you would suggest that is more suitable for this task?

A related question was asked about profiling CPU usage of a multithreaded program: How can I profile a multithread program in Python?

Another related question regarding memory profilers: Python memory profiler

asked Jan 25 '11 by Utku Zihnioglu


3 Answers

There are ways to get valgrind to profile the memory of Python programs: http://www.python.org/dev/faq/#can-i-run-valgrind-against-python

answered by Foo Bah


If you are happy to profile objects rather than raw memory, you can use the gc.get_objects() function so you don't need a custom metaclass. In more recent Python versions, sys.getsizeof() will also let you take a shot at figuring out how much underlying memory is in use by those objects.
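For example, a minimal sketch along these lines (the object_census helper is just illustrative):

import gc
import sys
from collections import Counter

def object_census():
    """Count live objects by type and roughly estimate their shallow sizes."""
    counts = Counter()
    sizes = Counter()
    for obj in gc.get_objects():
        name = type(obj).__name__
        counts[name] += 1
        try:
            sizes[name] += sys.getsizeof(obj)
        except TypeError:
            pass  # a few exotic objects may not support getsizeof
    return counts, sizes

counts, sizes = object_census()
for name, n in counts.most_common(10):
    print("%s: %d objects, ~%d bytes (shallow)" % (name, n, sizes[name]))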

answered by ncoghlan


OK. The exact thing I was looking for does not seem to exist, so I found a workaround for this problem.

Instead of profiling memory, I'll profile objects. This way, I'll be able to see how many objects exist at a specific point in the program. To achieve this, I made use of metaclasses with minimal modification to the existing code.

The following metaclass wraps the __init__ and __del__ methods of a class with a very simple subroutine: the __init__ wrapper increments the counter for that class name by one, and the __del__ wrapper decrements it by one.

class ObjectProfilerMeta(type):
    # Just set the metaclass of a class to ObjectProfilerMeta to profile its objects.
    def __new__(cls, name, bases, attrs):
        if name.startswith('None'):
            return None

        # Wrap __init__ to increment the counter; fall back to a dummy if the class has none.
        if "__init__" in attrs:
            attrs["__init__"] = incAndCall(name, attrs["__init__"])
        else:
            attrs["__init__"] = incAndCall(name, dummyFunction)

        # Wrap __del__ to decrement the counter; fall back to a dummy if the class has none.
        if "__del__" in attrs:
            attrs["__del__"] = decAndCall(name, attrs["__del__"])
        else:
            attrs["__del__"] = decAndCall(name, dummyFunction)

        return super(ObjectProfilerMeta, cls).__new__(cls, name, bases, attrs)

    def __init__(self, name, bases, attrs):
        super(ObjectProfilerMeta, self).__init__(name, bases, attrs)

    def __add__(self, other):
        class AutoClass(self, other):
            pass
        return AutoClass

The incAndCall and decAndCall functions use a global variable (counter) of the module they are defined in.

counter = {}

def incAndCall(name, func):
    if name not in counter:
        counter[name] = 0

    def f(*args, **kwargs):
        counter[name] += 1
        func(*args, **kwargs)

    return f

def decAndCall(name, func):
    if name not in counter:
        counter[name] = 0

    def f(*args, **kwargs):
        counter[name] -= 1
        func(*args, **kwargs)

    return f

def dummyFunction(*args, **kwargs):
    pass

The dummyFunction is just a very simple workaround. I am sure there are much better ways to do it.

Finally, whenever you want to see how many objects currently exist, you just need to look at the counter dictionary. An example:

>>> class A:
    __metaclass__=ObjectProfilerMeta
    def __init__(self):
        pass


>>> class B:
    __metaclass__=ObjectProfilerMeta


>>> l=[]
>>> for i in range(117):
    l.append(A())


>>> for i in range(18):
    l.append(B())


>>> counter
{'A': 117, 'B': 18}
>>> l.pop(15)
<__main__.A object at 0x01210CB0>
>>> counter
{'A': 116, 'B': 18}
>>> l=[]
>>> counter
{'A': 0, 'B': 0}

I hope this helps you. It was sufficient for my case.

answered by Utku Zihnioglu