Question

Is there a way to profile the memory usage of a multithreaded Python program?

For CPU profiling, I use cProfile to create separate profiler stats for each thread and combine them later. However, I couldn't find a way to do this with memory profilers. I am using heapy.

Is there a way to combine stats in heapy, as with cProfile? If not, which other memory profilers would be more suitable for this task?

A related question was asked about profiling CPU usage in a multithreaded program: How can I profile a multithread program in Python?

Another question covers memory profilers: Python memory profiler


Solution

If you are happy to profile objects rather than raw memory, you can call gc.get_objects(), which needs no custom metaclass. It returns only the objects tracked by the garbage collector. On Python 2.6 and later, sys.getsizeof() also gives a rough estimate of how much memory each of those objects occupies; the figure is shallow and excludes anything the object refers to.
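As a minimal sketch of that idea, the snippet below groups the gc-tracked objects by type name and sums their shallow sizes. The helper names snapshot_by_type and growth are made up for this illustration. Because all threads share one heap, the snapshot covers the whole process rather than a single thread.

```python
import gc
import sys
from collections import Counter


def snapshot_by_type():
    """Return (counts, sizes) keyed by type name for gc-tracked objects."""
    counts = Counter()
    sizes = Counter()
    for obj in gc.get_objects():
        name = type(obj).__name__
        counts[name] += 1
        # Shallow size only; 0 is used if the object reports no size.
        sizes[name] += sys.getsizeof(obj, 0)
    return counts, sizes


def growth(before, after, top=10):
    """List the type names whose object count grew between two snapshots."""
    # Counter subtraction keeps only the positive differences.
    return (after - before).most_common(top)
```

Taking one snapshot before and one after the code under test, then passing both count dictionaries to growth, shows which types accumulated instances in between.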

OTHER TIPS

There are ways to get valgrind to profile memory of python programs: http://www.python.org/dev/faq/#can-i-run-valgrind-against-python

OK. What I was looking for does not seem to exist, so I settled on a workaround.

Instead of profiling memory, I'll profile objects. This way, I can see how many objects exist at a given point in the program. To do that, I used a metaclass, which required only minimal changes to the existing code.

The following metaclass wraps the __init__ and __del__ methods of the class in a very simple subroutine. The __init__ wrapper increments the number of objects recorded for that class name, and the __del__ wrapper decrements it.

class ObjectProfilerMeta(type):
    # Set this as the metaclass of a class to count its live instances.
    def __new__(cls, name, bases, attrs):
        if name.startswith('None'):
            return None

        # Wrap __init__ and __del__ so that they update the counter.
        # A class that defines neither gets a no-op placeholder, which
        # also shadows any __init__ or __del__ inherited from a base class.
        attrs["__init__"] = incAndCall(name, attrs.get("__init__", dummyFunction))
        attrs["__del__"] = decAndCall(name, attrs.get("__del__", dummyFunction))

        return super(ObjectProfilerMeta, cls).__new__(cls, name, bases, attrs)

    def __add__(self, other):
        # A + B builds a new class that inherits from both operands.
        class AutoClass(self, other):
            pass
        return AutoClass

The incAndCall and decAndCall functions update a global dictionary, counter, defined in the same module.

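counter = {}  # number of live instances per class name

def incAndCall(name, func):
    # Return a wrapper that increments counter[name], then calls func.
    counter.setdefault(name, 0)

    def f(*args, **kwargs):
        # Note: += on a dict entry is not atomic across threads.
        counter[name] += 1
        return func(*args, **kwargs)

    return f

def decAndCall(name, func):
    # Return a wrapper that decrements counter[name], then calls func.
    counter.setdefault(name, 0)

    def f(*args, **kwargs):
        counter[name] -= 1
        return func(*args, **kwargs)

    return f

def dummyFunction(*args, **kwargs):
    # No-op stand-in for a missing __init__ or __del__.
    pass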

The dummyFunction is just a simple placeholder. I am sure there are much better ways to do it.

Finally, whenever you want to see how many objects exist, just look at the counter dictionary. For example (Python 2 syntax):

>>> class A:
    __metaclass__=ObjectProfilerMeta
    def __init__(self):
        pass


>>> class B:
    __metaclass__=ObjectProfilerMeta


>>> l=[]
>>> for i in range(117):
    l.append(A())


>>> for i in range(18):
    l.append(B())


>>> counter
{'A': 117, 'B': 18}
>>> l.pop(15)
<__main__.A object at 0x01210CB0>
>>> counter
{'A': 116, 'B': 18}
>>> l=[]
>>> counter
{'A': 0, 'B': 0}

I hope this helps you. It was sufficient for my case.
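On Python 3 the __metaclass__ attribute is ignored, so the example above would need the class A(metaclass=...) form instead. The sketch below is one possible adaptation, not the method used above: it swaps the __init__/__del__ wrappers for a metaclass __call__ plus weakref.finalize, and guards the counter with a lock so that several threads can create objects at once. The names CountingMeta, live_counts and _adjust are hypothetical, and instances must support weak references, which ordinary classes without __slots__ do.

```python
import threading
import weakref
from collections import Counter

# Hypothetical names for this sketch; none of them come from the answer above.
live_counts = Counter()
# Re-entrant, in case a finalizer fires while this thread holds the lock.
_counts_lock = threading.RLock()


def _adjust(name, delta):
    """Add delta to the live-instance count recorded for a class name."""
    with _counts_lock:
        live_counts[name] += delta


class CountingMeta(type):
    """Metaclass that counts the live instances of every class using it."""

    def __call__(cls, *args, **kwargs):
        # type.__call__ runs __new__ and __init__, so inherited
        # constructors keep working unchanged.
        obj = super().__call__(*args, **kwargs)
        _adjust(cls.__name__, +1)
        # Decrement again once the instance is garbage collected.
        weakref.finalize(obj, _adjust, cls.__name__, -1)
        return obj
```

A class opts in with class Job(metaclass=CountingMeta), after which live_counts["Job"] reports the number of instances still alive, whichever thread created them.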

I've used Yappi with success in a few special multithreaded cases. It has great documentation, so you shouldn't have much trouble setting it up.

For memory specific profiling, check out Heapy. Be warned, it may create some of the largest log files you've ever seen!

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow