Monitor and Terminate Python script based on system resource use

https://stackoverflow.com/questions/2573736

24-09-2019

Question

What is the "right" or "best" way to monitor the system resources a Python script is using and terminate it if the resource use exceeds some predetermined values? In my case, memory usage is the concern. I am not asking how to measure the system resource use, although I am open to suggestions.

As a simple example, let's assume I have a function that finds prime numbers less than some large number and adds them to a list based on some condition. I don't know ahead of time how many prime numbers will satisfy the condition, so I want to be sure to terminate the function if I use up too much system memory (8 GB, let's say). I know that there are ways to monitor the size of Python objects. What I don't know is whether the proper way to monitor the size of the list and exit is to just include a size test in the prime function loop and exit if it exceeds 8 GB, or whether there is an "external" way to monitor and exit (by external I mean external to the loop but still within or part of the Python script).

In my case I am running on a Mac, but I am asking the question in general.

Solution

On Unix-like systems, a useful "external" way to monitor any process is the ulimit command (you don't clarify whether you also want to run on Windows, where ulimit doesn't exist; other approaches may exist there, but I don't know them;-).
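To stay in Python, roughly the same effect as running the script under ulimit -v can be sketched by launching it as a child process with a capped address space. This is only an illustration, not a definitive recipe: run_with_memory_limit and max_bytes are hypothetical names, it assumes a Unix-like system where the resource module is available, and enforcement of RLIMIT_AS varies by platform (on macOS it is reportedly not reliable).

```python
import resource
import subprocess
import sys


def run_with_memory_limit(argv, max_bytes):
    """Run argv as a child process with its address space capped at max_bytes.

    Hypothetical helper: a sketch of what `ulimit -v` does from a shell.
    """
    def set_limit():
        # Executed in the child between fork and exec.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        # Only lower the soft limit, and never ask for more than the hard limit.
        if hard == resource.RLIM_INFINITY:
            soft = max_bytes
        else:
            soft = min(max_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))

    return subprocess.run(argv, preexec_fn=set_limit,
                          capture_output=True, text=True)
```

For the 8 GB case in the question, a call might look like run_with_memory_limit([sys.executable, "primes.py"], 8 * 1024**3), where primes.py stands in for the script being limited. An allocation that would push the child past the cap then fails, which CPython normally surfaces as a MemoryError.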

If you're thinking about performing such controls inside your own Python programs, just change the function in question to check the size of each object it's appending to the list (and keep a running total) and return when the running total reaches or exceeds a threshold (which you could pass as an extra parameter to the function in question).
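A minimal sketch of that in-function check, using the prime example from the question. The names collect_primes and max_bytes are made up for illustration, and sys.getsizeof is used as a rough size estimate only: it counts each int object but not the list's own internal storage, so the real footprint is somewhat larger than the running total.

```python
import sys


def collect_primes(limit, max_bytes):
    """Collect primes below `limit`, stopping once the estimated size
    of the collected objects reaches `max_bytes`.

    Returns (primes, truncated); truncated is True if the threshold was hit.
    """
    primes = []
    used = 0  # running total of the sizes of appended objects, in bytes
    for n in range(2, limit):
        is_prime = True
        for p in primes:
            if p * p > n:
                break  # no divisor up to sqrt(n): n is prime
            if n % p == 0:
                is_prime = False
                break
        if not is_prime:
            continue
        primes.append(n)
        used += sys.getsizeof(n)
        if used >= max_bytes:
            return primes, True
    return primes, False
```

The caller can then decide what to do with a truncated result, for example primes, truncated = collect_primes(10**9, 8 * 1024**3).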

Edit: the OP has clarified in a comment that they want the monitoring in the very worst place it could possibly be placed -- in the previous paragraphs, I mentioned how it's easy outside of the process, easy inside the function, but the OP wants it "smack in the middle";-).

The least-bad way is probably a "watchdog thread" -- a separate daemon thread in an infinite loop which, every X seconds, checks the process's resource consumption (e.g. with resource.getrusage, if on Unix-like machines -- again, if on Windows, something else is needed instead) and, if that consumption exceeds the desired limits, attempts to kill the main thread with thread.interrupt_main (_thread.interrupt_main in Python 3). Of course, this is far from foolproof: the period X (as in all cases of "polling") must be low enough to catch a runaway process in time, but high enough not to slow the process down to a crawl. Plus, the main thread (the only one that can be interrupted like this) might be blocking exceptions (in which case the watchdog thread might perhaps try with "signals to this very process" of growing severity, all the way up to SIGKILL, the killer-signal that can never be blocked or intercepted).
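A rough sketch of such a watchdog for Python 3 on a Unix-like system follows. The helper names (peak_rss_bytes, start_watchdog) and the on_exceed parameter are assumptions made for illustration. It polls ru_maxrss, which is the peak resident set size rather than the current one, and which is reported in kilobytes on Linux but in bytes on macOS.

```python
import _thread
import resource
import sys
import threading
import time


def peak_rss_bytes():
    """Peak resident set size of this process, in bytes."""
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is in bytes on macOS and in kilobytes on Linux.
    return rss if sys.platform == "darwin" else rss * 1024


def start_watchdog(max_bytes, interval=1.0, on_exceed=_thread.interrupt_main):
    """Start a daemon thread that polls memory use every `interval` seconds
    and calls `on_exceed` once the peak reaches `max_bytes`."""
    def watch():
        while True:
            if peak_rss_bytes() >= max_bytes:
                # By default this asks the interpreter to raise
                # KeyboardInterrupt in the main thread.
                on_exceed()
                return
            time.sleep(interval)

    thread = threading.Thread(target=watch, daemon=True)
    thread.start()
    return thread
```

The main thread would call start_watchdog(8 * 1024**3) before the heavy work and wrap that work in try/except KeyboardInterrupt. As noted above, the interrupt is only delivered between bytecodes of the main thread, so a long-running C call can delay it.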

So, this intermediate approach is a lot more work than the ulimit command, is more fragile, and has no substantial added value. But, if you want to put the monitoring "inside the process but outside the resource-consuming function", with no advantages, lots of work, and the other disadvantages I've mentioned, this is the way to do it.

OTHER TIPS

resource.getrusage() (in particular ru_maxrss, the peak resident set size; ru_idrss is documented as unused on Linux) can give you the resource usage of the current Python interpreter, which you can use as a sentinel to stop processing.

Licensed under: CC-BY-SA with attribution
