Problem

I have a pointer A that points to another pointer B (that is, *A holds B). Somewhere in the program, B is getting corrupted, and the corruption happens before B is stored in *A. B does not always live at the same address (&B varies), but it is always corrupted with the same value. I know this because I have a corruption-detection routine that inspects the next value about to be stored in *A, so the only point I can observe is one where B is already corrupt.
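
For illustration, a check of this kind could look roughly like the C sketch below. The question does not show the original routine, so this is only an assumption about its shape: the names MAGIC_CORRUPT, checked_store and CHECKED_STORE are hypothetical, and 0xDEADBEEF is a stand-in for the real corrupted value.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the value B always ends up corrupted with. */
#define MAGIC_CORRUPT ((uintptr_t)0xDEADBEEFu)

/* Store `next` (B) into *slot (*A) unless it already holds the known-bad
 * value. Returns 0 on a clean store, -1 if corruption was detected, in
 * which case nothing is stored. */
static int checked_store(void **slot, void *next, const char *file, int line)
{
    assert(slot != NULL);
    if ((uintptr_t)next == MAGIC_CORRUPT) {
        fprintf(stderr, "%s:%d: corrupted pointer %p about to be stored\n",
                file, line, next);
        return -1;
    }
    *slot = next;
    return 0;
}

/* Convenience wrapper that records the call site. */
#define CHECKED_STORE(slot, next) \
    checked_store((slot), (next), __FILE__, __LINE__)
```

A check like this only reports the store site, which is exactly the limitation described above: by the time it fires, the corruption has already happened somewhere else.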

What I want (of course) is to find out where it is that B is getting corrupted.

I've already tried a gdb watchpoint on A whose command list sets a second watchpoint on whatever A points to (*A). The hope was to catch B when it is first stored in *A, before it is removed from *A, becomes corrupt, and is then stored back in *A.

watch A
commands
  silent
  watch *A
  commands
    silent
    if *A == magicalcorruptedvalue
      where
    end
  end
end

The problem is that too many intermediate values get stored in *A, and since the number of hardware watchpoints is limited, I quickly run out of them. I haven't tried software watchpoints, as they don't work well with threads.

At this point, I think my only options are to go back and read the code more carefully (always a decent option), build more unit tests, or write a dedicated thread that continuously scans all allocated memory for this value.
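
The core of such a scanner might be sketched as below, assuming the program can hand it the start and length of each allocated region. The function name find_word is hypothetical, and the reads are deliberately unsynchronized, so this would be a diagnostic hack rather than well-defined concurrent code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scan `count` pointer-sized words starting at `words` and return the index
 * of the first one equal to `magic`, or -1 if none matches. The volatile
 * qualifier forces a fresh read of every word on each pass. */
static ptrdiff_t find_word(const volatile uintptr_t *words, size_t count,
                           uintptr_t magic)
{
    assert(words != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (words[i] == magic)
            return (ptrdiff_t)i;
    }
    return -1;
}
```

A scanner thread would call this in a loop over every tracked region and stop the program (for example with abort()) on the first hit. Even then it only narrows the time window in which the corruption occurred; it does not identify the writer.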

However, I suspect I'm not the first to run into this problem. A more general way of phrasing this question might be: what are techniques for debugging buffer overruns when all the easy techniques fail?

Meta-parameters:

  • On Linux

  • This is in a multithreaded callback-style application.


Solution

I've come up with two answers, one of which actually worked for me.

  1. Use the right tool for the job. gdb is for debugging program flow. Valgrind seems much better suited to debugging memory and buffer errors. Running the program under Valgrind, I found the bug(s) pretty quickly.

  2. There is, in theory, a way to do this in gdb. In practice it would only be fast enough if *A changed relatively rarely, not the 10k+ times it changed in this particular program.

Here it is:

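# $A: address of the variable that holds the pointer being tracked
set $A = (void ***) &whateveritis
set $B = (void **) 0
# $WPN: number gdb will assign to the next watchpoint
# (assumes the watch on *$A below becomes number 1)
set $WPN = 2
# $WP: number of the current inner watchpoint, 0 if there is none
set $WP = 0
watch *$A
commands
  silent
  set $B = *$A
  # drop the watchpoint on the previous target
  if $WP != 0
    delete $WP
    set $WP = 0
  end
  # watch the new target, unless it is NULL
  if $B != (void *) 0
    watch *$B
    commands
      silent
      if *$B == magicalcorruptedvalue
        where
      else
        continue
      end
    end
    set $WP = $WPN++
  end
  continue
end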

This sets a watchpoint on A. Each time it fires, it deletes the watchpoint on the previous value of *A (which will eventually be set to B), then sets a new watchpoint on the current *A. The inner watchpoint always looks for magicalcorruptedvalue, although in my case any change to B at all while it was stored in *A would have been an error, so I omitted that check.

Note that $WPN is the number of the next breakpoint. Beware of it getting out of sync because of implicit temporary breakpoints, such as the one created by start.

This should work, but note that in my case the program was so bogged down that it never made it to the corrupt section, and so the watchpoint for B was never tripped. YMMV.

Back to the real world where nobody ever does anything this complex with gdb. The lessons here are: (1) Learn what tools are out there. Valgrind is amazing, and I'm totally unsure how any C programmer, myself included, has ever coped without it. (2) Think carefully about the problem domain and its relationship to the tool you're trying to use. If the tool doesn't fit, use or write something else. Don't let language fool you: just because you're "debugging" doesn't mean you should do it with a "debugger."

License: CC-BY-SA with attribution
Not affiliated with StackOverflow