Question

TL;DR Does it make sense to write multiple dumps for the same crash event, and if so, what do you need to look out for?


We're using MiniDumpWriteDump to write a crash dump when there is an unhandled exception / abort / you-name-it in our application.

The code so far actually writes two dumps:

  • One with MiniDumpWithDataSegs to get a small one that can be sent by even crappy email w/o problem once zipped.
  • A full one MiniDumpWithFullMemory to have the full info available should we need it.

To make this work, we call MiniDumpWriteDump twice:

1. Open/create file for small dump
2. Write small dump
3. Open/create file for large dump
4. Write large dump
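The four steps above can be sketched roughly as follows. This is a minimal illustration only, not our actual handler: the file names are hypothetical, error handling is trimmed, and it assumes it runs inside the unhandled-exception filter of the crashing process, where `pExceptionPointers` is the filter's `EXCEPTION_POINTERS` argument.

```cpp
#include <windows.h>
#include <dbghelp.h>   // link with dbghelp.lib

static void WriteBothDumps(EXCEPTION_POINTERS* pExceptionPointers)
{
    MINIDUMP_EXCEPTION_INFORMATION mei = {};
    mei.ThreadId = GetCurrentThreadId();
    mei.ExceptionPointers = pExceptionPointers;
    mei.ClientPointers = FALSE;

    // Steps 1+2: open and write the small dump (stacks + data segments).
    HANDLE hSmall = CreateFileW(L"small.dmp", GENERIC_WRITE, 0, nullptr,
                                CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, nullptr);
    MiniDumpWriteDump(GetCurrentProcess(), GetCurrentProcessId(), hSmall,
                      MiniDumpWithDataSegs, &mei, nullptr, nullptr);
    CloseHandle(hSmall);

    // Steps 3+4: open and write the full dump (all committed memory).
    // Threads keep running between the two calls, which is exactly the
    // "drift" the question describes.
    HANDLE hFull = CreateFileW(L"full.dmp", GENERIC_WRITE, 0, nullptr,
                               CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, nullptr);
    MiniDumpWriteDump(GetCurrentProcess(), GetCurrentProcessId(), hFull,
                      MiniDumpWithFullMemory, &mei, nullptr, nullptr);
    CloseHandle(hFull);
}
```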

As far as I can tell, one additional idea behind this scheme was that writing the small dump is faster. It's basically always subsecond, while writing the large dump can often take quite a few seconds, especially when the application is fully loaded and the large dump will easily be 1.2 GB or more.

The idea behind writing the small dump first, as far as I can tell, was that because it's faster, it would take a more detailed snapshot of the crashed process at the point in time it crashed, as the process is heavily multithreaded.

Obviously, the threads of the process continue to run between the end of the first call and the start of the second call to MiniDumpWriteDump, so we do have quite a few cases where the info in the small dump is actually more accurate than the info in the large dump.

After thinking about this, I would assume however, that to write the dump, MiniDumpWriteDump has to suspend the threads of the process anyway, so if we were to write the large dump first, we would have the large dump more accurate than the small one.

Question

Should we write the large dump before the small one? Should we even be writing two dumps? Could we somehow have the system first suspend the threads of the process and then write two dumps that are completely "synchronous"?

Was it helpful?

Solution

I analyzed dumps from various customers for a couple of years; the following is only my personal perspective on your question. Hope this helps.

Should we write the large dump before the small one? I don't consider the order important for typical issues like crashes and hangs: the crash spot is there, and the deadlock is there in the dump, whether it was captured first or second.

Should we even be writing two dumps? I would suggest writing at least one full dump. The small dump is very convenient for getting an initial impression of the problem, but it's very limited, especially when your application crashes. So you might ask the customer to email you the small dump for a first round of triage, and if that doesn't reveal the root cause, ask for the full dump. Technically you can strip a small dump out of a full dump; however, you may not want your customers doing that sort of work for you, so this depends on how you interact with your customers.

Could we somehow have the system first suspend the threads of the process and then write two dumps that are completely "synchronous"?

Technically this is doable. For example, it's relatively easy out-of-process: a single NtSuspendProcess() call suspends all target threads, but it has to be made from another process. If you prefer to do it in-process, you have to enumerate all threads and call SuspendThread() on each, which is how MiniDumpWriteDump() works internally. However, I don't think synchronous vs. asynchronous affects the accuracy of the dump.
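The out-of-process variant could look roughly like this. Note the caveats: NtSuspendProcess/NtResumeProcess are undocumented ntdll exports, so they are resolved at run time; the PID, access rights, and complete lack of error handling are all simplifications for illustration.

```cpp
#include <windows.h>

// Undocumented ntdll exports; signatures as commonly described, not in the SDK headers.
typedef LONG (NTAPI* NtSuspendProcess_t)(HANDLE);
typedef LONG (NTAPI* NtResumeProcess_t)(HANDLE);

void SuspendDumpResume(DWORD pid)
{
    HMODULE ntdll = GetModuleHandleW(L"ntdll.dll");
    auto pSuspend = (NtSuspendProcess_t)GetProcAddress(ntdll, "NtSuspendProcess");
    auto pResume  = (NtResumeProcess_t)GetProcAddress(ntdll, "NtResumeProcess");

    HANDLE hProcess = OpenProcess(PROCESS_ALL_ACCESS, FALSE, pid);
    pSuspend(hProcess);   // freeze every thread in the target process
    // ... call MiniDumpWriteDump (on hProcess) twice here; both dumps
    //     now see the same, unchanging process state ...
    pResume(hProcess);    // let the target run again
    CloseHandle(hProcess);
}
```

A separate watchdog process that does this also sidesteps the general fragility of writing dumps from inside an already-crashed process.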

OTHER TIPS

Writing two dumps at the same time is not advisable because MiniDumpWriteDump is not thread safe.

All DbgHelp functions, such as this one, are single threaded. Therefore, calls from more than one thread to this function will likely result in unexpected behavior or memory corruption.
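If dumps might be requested from more than one thread, one way to respect that single-threaded contract is to funnel every call through one lock. A minimal sketch (names are hypothetical; the critical section must be initialized at startup, and taking locks inside a crash handler carries its own risks):

```cpp
#include <windows.h>
#include <dbghelp.h>   // link with dbghelp.lib

static CRITICAL_SECTION g_dumpLock;  // InitializeCriticalSection() at startup

BOOL GuardedWriteDump(HANDLE hFile, MINIDUMP_TYPE type,
                      MINIDUMP_EXCEPTION_INFORMATION* mei)
{
    EnterCriticalSection(&g_dumpLock);   // serialize all DbgHelp use
    BOOL ok = MiniDumpWriteDump(GetCurrentProcess(), GetCurrentProcessId(),
                                hFile, type, mei, nullptr, nullptr);
    LeaveCriticalSection(&g_dumpLock);
    return ok;
}
```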

Whether you should write a large dump alongside a small dump depends on your application and the kind of bugs you expect. The minidump only contains stack information; it does not contain heap memory, handle information, or recently unloaded modules.

Obtaining stack information will obviously give you a stack trace, but if the stack trace only tells you that your last action was to reference some memory on the heap, the trace isn't of much use. Your expected failure modes will dictate which makes more sense. If you have legacy code that maybe isn't using RAII to manage handles, or the handling of heap-allocated memory isn't as disciplined as you'd like, then a full dump will be useful.

You should also consider the person who will submit the memory dump. If your customers are on the Internet, they might not appreciate submitting a sizable memory dump. They might also be worried about the private data that may be submitted along with a full memory dump. A minidump is much smaller, easier to submit, and less likely (though not guaranteed) to contain private data. If your customers are running on an internal network, then a full memory dump is more acceptable.

It is better to write a minidump first and then a large dump. This way you are more likely to get some data out quickly, rather than waiting for a full dump. A full dump can take a while, and users are often impatient; they may decide to kill the process so they can get back to work. Also, if the disk is getting full (potentially the cause of the crash), it's slightly more likely that you have room for a minidump than for a full dump.

DbgHelp.dll imports SuspendThread and ResumeThread. You can do the same thing: call SuspendThread for all threads (except the current one, of course), call MiniDumpWriteDump as many times as you need, then call ResumeThread on each thread you suspended. This should give you consistently accurate dumps.
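An in-process version of this idea can use a Toolhelp snapshot to find the process's threads. A sketch under the usual caveats: error handling is trimmed, and a thread created between the snapshot and the suspend loop can still slip through, so real code may want to loop until no new threads appear.

```cpp
#include <windows.h>
#include <tlhelp32.h>
#include <vector>

// Suspend every thread in this process except the caller; the handles of
// the suspended threads are returned so they can be resumed later.
void SuspendAllOtherThreads(std::vector<HANDLE>& suspended)
{
    DWORD pid  = GetCurrentProcessId();
    DWORD self = GetCurrentThreadId();
    HANDLE snap = CreateToolhelp32Snapshot(TH32CS_SNAPTHREAD, 0);
    THREADENTRY32 te = { sizeof(te) };
    if (Thread32First(snap, &te)) {
        do {
            if (te.th32OwnerProcessID == pid && te.th32ThreadID != self) {
                HANDLE h = OpenThread(THREAD_SUSPEND_RESUME, FALSE,
                                      te.th32ThreadID);
                if (h && SuspendThread(h) != (DWORD)-1)
                    suspended.push_back(h);
                else if (h)
                    CloseHandle(h);
            }
        } while (Thread32Next(snap, &te));
    }
    CloseHandle(snap);
}

void ResumeSuspendedThreads(std::vector<HANDLE>& suspended)
{
    for (HANDLE h : suspended) { ResumeThread(h); CloseHandle(h); }
    suspended.clear();
}

// Usage: SuspendAllOtherThreads(v); write small dump, then full dump;
//        ResumeSuspendedThreads(v);  -> both dumps see identical state.
```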

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow