gdb backtrace and pthread_cond_wait()

https://stackoverflow.com/questions/1843765

12-09-2019

Question

This is on a Red Hat EL5 machine with a 2.6.18-164.2.1.el5 x86_64 kernel, using gcc 4.1.2 and gdb 7.0.

When I run my application with gdb and break in while it's running, several of my threads show the following call stack when I do a backtrace:

#0  0x000000000051d7da in pthread_cond_wait ()
#1  0x0000000100000000 in ?? ()
#2  0x0000000000c1c3b0 in ?? ()
#3  0x0000000000c1c448 in ?? ()
#4  0x00000000000007dd in ?? ()
#5  0x000000000051d630 in ?? ()
#6  0x00007fffffffdc90 in ?? ()
#7  0x000000003b1ae84b in ?? ()
#8  0x00007fffffffdd50 in ?? ()
#9  0x0000000000000000 in ?? ()

Is this a symptom of a common problem?
Is there a known issue with viewing the call stack while waiting on a condition?

Solution

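The problem is that pthread_cond_wait is written in hand-coded assembly and apparently doesn't have a proper unwind descriptor in your build of glibc. On x86_64, gdb needs that descriptor to unwind the stack; without it, every frame below pthread_cond_wait is a guess, which is why they show up as "??". This problem may have been fixed recently in newer glibc sources.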

You can try to build and install the latest glibc (note: if you botch the installation, your machine will likely become unbootable; approach with extreme caution!), or just live with "bogus" stack traces from pthread_cond_wait.

OTHER TIPS

Generally, synchronization is required when multiple threads share a single resource. In that case, when you interrupt the program, you will typically see only one thread running (i.e., accessing the resource) while the other threads are waiting inside pthread_cond_wait().

So I don't think seeing threads in pthread_cond_wait() is a problem in itself.

If your program hangs in a deadlock or its performance doesn't scale, the way it uses pthread_cond_wait() might be the cause.
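The situation described in this answer, several threads blocked in pthread_cond_wait() while another thread runs, can be reproduced with a small sketch like the one below. The names run_waiters, waiter, ready and released are hypothetical, chosen only for illustration; they do not come from the question's application.

```c
#include <assert.h>
#include <pthread.h>
#include <unistd.h>

#define MAX_WAITERS 16   /* arbitrary limit for this sketch */

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int ready    = 0;   /* predicate the waiters block on */
static int released = 0;   /* number of waiters that got past the wait */

/* Each waiter blocks in pthread_cond_wait() until `ready` becomes true. */
static void *waiter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    while (!ready)                        /* loop guards against spurious wakeups */
        pthread_cond_wait(&cond, &lock);
    released++;
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Start n waiters, keep them blocked for hold_seconds, then release them all.
 * Returns how many waiters were released. */
int run_waiters(int n, unsigned hold_seconds)
{
    pthread_t tids[MAX_WAITERS];
    int i, rc;

    assert(n > 0 && n <= MAX_WAITERS);

    pthread_mutex_lock(&lock);
    ready = 0;
    released = 0;
    pthread_mutex_unlock(&lock);

    for (i = 0; i < n; i++) {
        rc = pthread_create(&tids[i], NULL, waiter, NULL);
        assert(rc == 0);
    }

    sleep(hold_seconds);   /* window in which to interrupt the program in gdb */

    pthread_mutex_lock(&lock);
    ready = 1;
    pthread_cond_broadcast(&cond);   /* wake every waiter */
    pthread_mutex_unlock(&lock);

    for (i = 0; i < n; i++) {
        rc = pthread_join(tids[i], NULL);
        assert(rc == 0);
    }
    return released;
}
```

Calling run_waiters(4, 30) from main, building with gcc -g -pthread, and interrupting the program under gdb during the 30-second hold, then running "thread apply all bt", should show the waiter threads sitting in pthread_cond_wait. Whether the frames below it unwind cleanly depends on the glibc build, as described in the accepted answer.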

That looks like a corrupt stack trace to me.

For example:

#9  0x0000000000000000 in ?? ()

There shouldn't be any code at address NULL.

Licensed under: CC-BY-SA with attribution
