Question

At the end of a script, I would like to return the peak memory usage. After reading other questions, here is my script:

#!/usr/bin/env python
import sys, os, resource, platform
print platform.platform(), platform.python_version()
os.system("grep 'VmRSS' /proc/%s/status" % os.getpid())
print resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
dat = [x for x in xrange(10000000)]
os.system("grep 'VmRSS' /proc/%s/status" % os.getpid())
print resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

and here is what I get:

$ test.py
Linux-2.6.18-194.26.1.el5-x86_64-with-redhat-5.5-Final 2.7.2
VmRSS:      4472 kB
0
VmRSS:    322684 kB
0

Why does resource.getrusage always return 0?

The same thing happens interactively in a terminal. Could this be due to the way Python was installed on my machine? (It's a shared computer cluster managed by admins.)

Edit: the same thing happens when I use subprocess. Executing this script

#!/usr/bin/env python
import sys, os, resource, platform
from subprocess import Popen, PIPE
print platform.platform(), platform.python_version()
p = Popen(["grep", "VmRSS", "/proc/%s/status" % os.getpid()], shell=False, stdout=PIPE)
print p.communicate()
print "resource:", resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
dat = [x for x in xrange(10000000)]
p = Popen(["grep", "VmRSS", "/proc/%s/status" % os.getpid()], shell=False, stdout=PIPE)
print p.communicate()
print "resource:", resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

gives this:

$ test.py
Linux-2.6.18-194.26.1.el5-x86_64-with-redhat-5.5-Final 2.7.2
('VmRSS:\t    4940 kB\n', None)
resource: 0
('VmRSS:\t  323152 kB\n', None)
resource: 0

Solution

Here's a way to replace the os.system call:

In [131]: from subprocess import Popen, PIPE

In [132]: p = Popen(["grep", "VmRSS", "/proc/%s/status" % os.getpid()], shell=False, stdout=PIPE)

In [133]: p.communicate()
Out[133]: ('VmRSS:\t  340832 kB\n', None)

I also have no problem running the line that returns 0 for you:

In [134]: print resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
340840

Edit

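The rusage issue could well be kernel-dependent, with ru_maxrss simply not being filled in on your Red Hat distribution: http://bytes.com/topic/python/answers/22489-getrusage. As far as I know, the getrusage(2) man page lists ru_maxrss as maintained only since Linux 2.6.32, and your kernel is 2.6.18.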
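If the kernel does not fill in ru_maxrss, the peak can sometimes still be read without any polling. Here is a minimal sketch, assuming a Linux /proc filesystem and assuming the kernel exposes a VmHWM ("high water mark") line in /proc/self/status, which I can't confirm for your system. The helper names parse_status_kb and peak_rss_kb are made up for illustration:

```python
import re


def parse_status_kb(status_text, field):
    """Return the integer kB value of `field` in /proc/<pid>/status text, or None."""
    pattern = r'^%s:\s+(\d+)\s+kB' % re.escape(field)
    match = re.search(pattern, status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def peak_rss_kb():
    """Peak RSS in kB: ru_maxrss if it is non-zero, else VmHWM, else None."""
    try:
        import resource
        # On Linux ru_maxrss is reported in kilobytes; other platforms may
        # use different units. Kernels that do not maintain it leave it at 0.
        peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    except ImportError:
        peak = 0
    if peak:
        return peak
    try:
        # Assumption: the kernel provides a VmHWM line (peak resident set size).
        with open("/proc/self/status") as status_file:
            return parse_status_kb(status_file.read(), "VmHWM")
    except IOError:
        return None
```

Calling peak_rss_kb() at the end of the script gives the peak in kB, or None if neither source is available, in which case polling is the remaining option.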

You could of course run a separate thread in your code that samples the current usage throughout execution and stores the highest value observed.

Edit 2

Here's a full solution that skips resource and monitors usage via Popen. The checking frequency must of course be high enough to catch short-lived peaks, but not so high that the monitoring eats all the CPU.

#!/usr/bin/env python

import threading
import time
import re
import os
from subprocess import Popen, PIPE

maxUsage = 0
keepThreadRunning = True


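def memWatch(freq=20):
    """Poll VmRSS freq times per second and record the largest value seen (kB)."""
    global maxUsage
    global keepThreadRunning

    while keepThreadRunning:
        # Read this process's current resident set size from /proc.
        p = Popen(["grep", "VmRSS", "/proc/%s/status" % os.getpid()],
                  shell=False, stdout=PIPE)

        # The first integer in the grep output is the RSS in kB.
        curUsage = int(re.search(r'\d+', p.communicate()[0]).group())

        if curUsage > maxUsage:
            maxUsage = curUsage

        time.sleep(1.0 / freq)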


if __name__ == "__main__":

    t = threading.Thread(target=memWatch)
    t.start()

    print maxUsage
    [p for p in range(1000000)]
    print maxUsage
    [str(p) for p in range(1000000)]
    print maxUsage
    keepThreadRunning = False
    t.join()

The memWatch function can be optimized by calculating the sleep time once, formatting the path to the process once rather than on every iteration, and compiling the regular expression before entering the while loop. All in all, I hope that was the functionality you sought.
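Those optimizations could look roughly like the sketch below. It is not the code above but a variation: it reads /proc/<pid>/status directly instead of spawning grep (a swapped-in technique), and it uses a threading.Event instead of a global flag so that stopping does not have to wait out a sleep. The class name MemWatch and its attributes are made up for illustration, and a Linux /proc filesystem is assumed:

```python
import os
import re
import threading


class MemWatch(threading.Thread):
    """Sample this process's VmRSS in the background; keep the maximum (kB)."""

    def __init__(self, freq=20):
        threading.Thread.__init__(self)
        self.daemon = True                            # never block interpreter exit
        self.interval = 1.0 / freq                    # sleep time, computed once
        self.path = "/proc/%d/status" % os.getpid()   # path, formatted once
        self.pattern = re.compile(r'^VmRSS:\s+(\d+)', re.MULTILINE)  # compiled once
        self.max_usage = 0
        self._halt = threading.Event()

    def sample(self):
        """Read the current RSS once and update the running maximum."""
        with open(self.path) as status_file:
            match = self.pattern.search(status_file.read())
        if match:
            self.max_usage = max(self.max_usage, int(match.group(1)))
        return self.max_usage

    def run(self):
        while True:
            self.sample()
            # wait() doubles as the sleep and returns True as soon as
            # stop() sets the event, which ends the loop.
            if self._halt.wait(self.interval):
                break

    def stop(self):
        """Stop sampling, wait for the thread to finish, return the peak in kB."""
        self._halt.set()
        self.join()
        return self.max_usage
```

To use it, create watcher = MemWatch(), call watcher.start() before the workload and watcher.stop() afterwards. The returned value is still only the highest sampled value, so a peak shorter than the sampling interval can be missed.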

Licensed under: CC-BY-SA with attribution