Question

A general question about Python code: how can I most effectively find the parts of my code that use the most memory?

Consider this small example:

def my_func():
    a = [1] * (12 ** 4)
    return a

def my_func2():
    b = [2] * (10 ** 7)
    return b

if __name__ == '__main__':
    a1 = my_func()
    a2 = my_func2()

How can I tell, in an automated way, that a2 is much larger than a1? And how can I, still automatically, trace this back to my_func() and my_func2()?

For C/C++ code I would use valgrind --tool=massif, which directly pinpoints the heavyweights in memory usage, but for Python I need your help. Meliae appears to give part of the answer, but not nearly as well as massif does for C/C++.

Solution

locals() (resp. globals()) returns a dictionary that maps every name in the local (resp. global) scope to its live object. You can use it like this:

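import sys
# Map each name in the current scope to the shallow size of its object, in bytes.
# list(...) takes a snapshot so the namespace can change while we iterate.
sizes = dict((name, sys.getsizeof(value)) for name, value in list(locals().items()))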
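
To rank names automatically, the same idea can be wrapped in a small helper. This is only a minimal sketch for Python 3: `rank_by_size` is a hypothetical name, not part of any library, and the sizes are still the shallow values reported by `sys.getsizeof`.

```python
import sys

def rank_by_size(namespace):
    """Return (name, shallow size in bytes) pairs, largest first."""
    # Snapshot the items so the namespace can change while we work.
    items = list(namespace.items())
    sizes = [(name, sys.getsizeof(value)) for name, value in items]
    return sorted(sizes, key=lambda pair: pair[1], reverse=True)

def my_func():
    return [1] * (12 ** 4)

def my_func2():
    return [2] * (10 ** 7)

a1 = my_func()
a2 = my_func2()

ranking = rank_by_size({"a1": a1, "a2": a2})
for name, size in ranking:
    print(name, size)
```

Passing `globals()` or `locals()` instead of the explicit dictionary ranks everything in that namespace.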

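The drawback is that sys.getsizeof is shallow: it gives misleading results for objects that do not fully implement __sizeof__ (the method sys.getsizeof calls), as can happen with NumPy arrays, and it does not follow references to other objects. For example, if you do: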

print(sys.getsizeof(a2))
print(sys.getsizeof(a1))
a2.append(a1)
print(sys.getsizeof(a2))

The output will be:

40000036
   82980
45000064   ---> a2 grew by about 5 MB, roughly 60 times the size of a1!

And, of course, just deleting a1 will not free its 82 kB, because there is still a reference to it in a2. But we can make it even weirder:

a2 = my_func2()
print(sys.getsizeof(a2))
a2.append(a2)
print(sys.getsizeof(a2))

And the output will look strangely familiar:

40000036
45000064

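In both cases the growth comes from the list over-allocating spare capacity for future appends; sys.getsizeof counts only the list's own array of pointers, never the objects they point to. Other tools may work around this by searching the reference tree, but the general problem of a full memory analysis in Python remains unsolved. It gets worse when objects store data via the C API, outside the scope of the reference counter, which happens with NumPy arrays, for example.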
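
As an illustration of what "searching the reference tree" can look like, here is a minimal sketch for Python 3. `deep_getsizeof` is a hypothetical helper, not a library function. It assumes only the built-in containers need to be followed, so it ignores instance attributes, `__slots__`, and any memory held behind the C API. It tracks object ids so shared and self-referencing objects are counted once.

```python
import sys

def deep_getsizeof(obj, seen=None):
    """Sum sys.getsizeof over obj and everything reachable through containers."""
    if seen is None:
        seen = set()
    if id(obj) in seen:
        # Already counted: a shared object or a reference cycle.
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        for key, value in obj.items():
            size += deep_getsizeof(key, seen) + deep_getsizeof(value, seen)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for item in obj:
            size += deep_getsizeof(item, seen)
    return size

inner = [0] * 1000
outer = [inner]
outer.append(outer)  # self-reference, as in the a2.append(a2) example

shallow = sys.getsizeof(outer)
deep = deep_getsizeof(outer)
print(shallow, deep)
```

Here `deep` includes `inner` and the single shared integer object once each, while `shallow` covers only the pointer array of `outer`. Calling the function per element is slow on lists with millions of entries, so treat it as a diagnostic aid rather than a profiler.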

That said, there are tools that are "good enough" for most practical situations. As in the referenced link, Heapy is a very good option.
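
For tracing memory back to the code that allocated it, one option not mentioned above is the standard library's tracemalloc module (Python 3.4 and later). The following is a minimal sketch, not a replacement for Heapy: it reruns the example from the question and groups live allocations by source line, so the line inside my_func2() should come out on top.

```python
import tracemalloc

def my_func():
    a = [1] * (12 ** 4)
    return a

def my_func2():
    b = [2] * (10 ** 7)
    return b

tracemalloc.start()
a1 = my_func()
a2 = my_func2()
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

# Group live allocations by the source line that made them, largest first.
top_stats = snapshot.statistics("lineno")
for stat in top_stats[:3]:
    print(stat)
```

Each printed entry names the file and line number of the allocation, together with the total size and the number of blocks, which is roughly the per-location view that massif gives for C/C++.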

Licensed under: CC-BY-SA with attribution