Question

I have to run a legacy Zope2 website and have some grievances with it. The biggest issue is that it occasionally locks up, running at 100% CPU load and no longer answering requests. While the problem isn't reliably reproducible, one page containing 3 dynamic graphs sometimes triggers it, so I suspect some kind of race condition that leads to an endless loop or a stuck busy-wait.

The problem is, I have not yet found a way to debug this thing. There's nothing in the Zope logs and nothing in the system logs. I tried the suggestions from this question to get a stacktrace, but the only signal that has any effect is SIGKILL.

Is there another possibility to find out where exactly the process is when it gets stuck?

Solution

If the process is stuck in a way that no other signal gets through, you might want to consider running it from a debugger, instead of trying to attach to it at runtime.

Also, it might be useful to try other debugging tactics, such as turning off certain parts of the code to find the minimal case in which the problem is still reproducible. That makes it easier to see what causes it.

OTHER TIPS

You can print out a nice stack trace using pyrasite.

First, you'll need to have gdb installed.

# Redhat, CentOS, etc
$ yum install gdb

# Ubuntu, Debian, etc
$ apt-get update && apt-get install gdb

Then, install pyrasite.

$ pip install pyrasite

Use ps or some other method to find the process ID for the stuck python process and run pyrasite-shell with it.

# Assuming process ID is 12345
$ pyrasite-shell 12345

You should now see a python REPL. Run the following in the REPL to see stack traces for all threads.

# Written so that it runs in a Python 2 REPL (as in a Zope2 process) and under Python 3.
import sys, traceback
for thread_id, frame in sys._current_frames().items():
    print('Stack for thread {0}'.format(thread_id))
    traceback.print_stack(frame)
    print('')
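
The thread IDs printed by the loop above are plain numbers, which can be hard to match to anything in a multi-threaded server. As a rough sketch, the same idea can be wrapped in a function that labels each stack with the thread's name and returns a string you could also write to a file. The function name format_all_stacks is made up for this example; it is not part of pyrasite or Zope.

```python
import sys
import threading
import traceback


def format_all_stacks():
    """Return one string holding the current stack of every thread."""
    # Map thread idents to names. Threads that were not started through the
    # threading module do not show up here and are labelled 'unknown'.
    names = dict((t.ident, t.name) for t in threading.enumerate())
    chunks = []
    for thread_id, frame in sys._current_frames().items():
        chunks.append('Stack for thread {0} ({1})'.format(
            thread_id, names.get(thread_id, 'unknown')))
        chunks.append(''.join(traceback.format_stack(frame)))
    return '\n'.join(chunks)
```

In the pyrasite REPL you would paste the function and then run print(format_all_stacks()).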

See my answer to this SO question: use Products.signalstack. It registers the same signal handler as the answer you already found, but does so at Product registration time. Perhaps it works better for you.
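
To illustrate the general technique (this is not the actual code of Products.signalstack), a stack-dumping signal handler registered at startup could look roughly like the sketch below. The names dump_stacks and install_handler and the choice of SIGUSR1 are assumptions for the example, and it only works on Unix-like systems.

```python
import signal
import sys
import traceback


def dump_stacks(signum, frame):
    """Signal handler: write the stack of every thread to stderr."""
    for thread_id, stack in sys._current_frames().items():
        sys.stderr.write('Thread {0}:\n'.format(thread_id))
        traceback.print_stack(stack, file=sys.stderr)


def install_handler(signum=signal.SIGUSR1):
    # Call this early, from the main thread, so the handler is in place
    # before anything can hang. SIGUSR1 is an assumption; any signal the
    # application does not already use would do.
    signal.signal(signum, dump_stacks)
```

After install_handler() has run, sending the signal from a shell (kill -USR1 followed by the process ID) should print the stacks. Keep in mind that Python only runs such handlers in the main thread between bytecode instructions, so a process stuck inside C code or a blocking system call will not react to it.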

If not, you probably have an OS-level I/O problem on your hands, and your only hope is attaching gdb to the process. Search Stack Overflow for gdb answers; there is a wealth of information here!

You could try to attach a debugger to the running process. See also this question.

After running around the internet in circles for a while, I finally ended up here: http://podoliaka.org/2016/04/10/debugging-cpython-gdb/. It describes in detail how all the pieces fit together. The money quote for me was 'gdb /usr/bin/python -p $PID': the name of the executable is required in order for gdb to find the correct debug info files.

Licensed under: CC-BY-SA with attribution