Identifying Duplicate Exceptions for Bug Tracking

https://stackoverflow.com/questions/285835

bug-tracking

08-07-2019

Question

I've developed a "Proof of Concept" application that logs unhandled exceptions from an application to a bug-tracking system (in this case Team Foundation Server, but it could be ANY bug-tracking system). One requirement is that I don't want a duplicate bug item opened every time the same exception is thrown (for example, when many users encounter the same exception, it's still a single "bug").

My first attempt was to store the Exception Type, Message and Stack Trace as fields in the bug-tracking system. The logging component would then query the Bug "Store" to see whether there is an open bug with the same information. (This example is .NET, but I would think the concept is platform independent.)

The obvious problem is that these fields can be very large (particularly the stack trace), so storing them requires a "Full-Text" type of implementation and searching them is very expensive.

I was wondering what approaches have been defined for this problem. I had heard that FogBugz for example had such a feature for automated bug tracking, and was curious how it was implemented.

Solution

You could compute a hash (checksum) of the stack trace and store it in an indexed column. The duplicate check on insert then compares a short fixed-length value instead of the full text, so the query against the Bug Store stays fast.
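As a rough illustration of the hashing idea, here is a minimal Python sketch (the question is about .NET, but the approach should carry over). The `bugs` table, the `fingerprint` and `record_exception` helpers, the choice of SHA-256, and the whitespace normalization are all assumptions made for this example, not part of any particular bug tracker's API.

```python
import hashlib
import sqlite3


def fingerprint(exc_type: str, stack_trace: str) -> str:
    """Return a fixed-length hex digest identifying an exception."""
    # Normalize line endings and indentation so that purely cosmetic
    # differences do not change the hash.
    lines = [line.strip() for line in stack_trace.strip().splitlines()]
    payload = exc_type + "\n" + "\n".join(lines)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def record_exception(conn: sqlite3.Connection, exc_type: str, stack_trace: str) -> bool:
    """Insert a bug row unless one with the same fingerprint already exists.

    Returns True if a new bug was created, False if it was a duplicate.
    """
    cur = conn.execute(
        "INSERT OR IGNORE INTO bugs (fingerprint, exc_type, stack_trace) "
        "VALUES (?, ?, ?)",
        (fingerprint(exc_type, stack_trace), exc_type, stack_trace),
    )
    conn.commit()
    return cur.rowcount == 1


# Hypothetical bug store: the PRIMARY KEY gives the fingerprint an index.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bugs (fingerprint TEXT PRIMARY KEY, exc_type TEXT, stack_trace TEXT)"
)
```

Calling `record_exception` twice with the same exception type and stack trace should create a bug only the first time. A real tracker would probably also restrict the check to open bugs, as the question describes.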

OTHER TIPS

If you have the stack trace, you could find the last statement executed (the frame that threw the exception) and compare it with the ones already logged. If the symbols were included, you'd also get the line number. So now you have two or three things to compare: the error number, the statement that failed, and possibly the line number. If something has already been logged with all of those, then it's more than likely (not 100%, of course) the same issue.

In fact, you could probably parse the stack trace using the word "at", as each frame line in the stack trace begins with "at". A .NET stack trace lists the most recent call first, so the frame that threw is the first "at" line. Get that line, compare it with the first "at" line of the stored stack traces, and you might actually have something.
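A minimal Python sketch of this frame-based comparison follows. It assumes the common .NET text format `at Method(args) in File:line N`, where the `in ... :line N` part is present only when symbols are available, and it takes the first "at" line as the throwing frame because .NET lists the most recent call first. `Frame`, `parse_frames`, and `dedup_key` are hypothetical names introduced for this example.

```python
import re
from typing import List, NamedTuple, Optional

# Matches lines such as:
#    at Namespace.Type.Method(Args) in C:\path\File.cs:line 42
# The " in <file>:line <n>" suffix is optional (no symbols, no suffix).
FRAME_RE = re.compile(
    r"^at (?P<method>.+?)(?: in (?P<file>.+):line (?P<line>\d+))?$"
)


class Frame(NamedTuple):
    method: str
    file: Optional[str]
    line: Optional[int]


def parse_frames(stack_trace: str) -> List[Frame]:
    """Return every 'at ...' frame in the trace, in the order written."""
    frames = []
    for raw in stack_trace.splitlines():
        m = FRAME_RE.match(raw.strip())
        if m:
            line = int(m.group("line")) if m.group("line") else None
            frames.append(Frame(m.group("method"), m.group("file"), line))
    return frames


def dedup_key(exc_type: str, stack_trace: str) -> Optional[tuple]:
    """Build a small comparison key from the frame that threw."""
    frames = parse_frames(stack_trace)
    if not frames:
        return None  # nothing recognizable to compare
    top = frames[0]  # assumption: most recent call is listed first
    return (exc_type, top.method, top.line)
```

Two reports with equal keys are likely, though not certainly, the same bug. Line numbers shift between builds, so a stricter or looser key may suit a given project better.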

HTH!

You could look at the source code for one of the existing open-source solutions that aggregate exceptions.

For example: https://github.com/getsentry/sentry/tree/master/src/sentry

It is not a simple problem, and the existing solutions rely on complex heuristics. For example, the same exception may be reported in different ways by different browsers, and exceptions caused by browser extensions are common but rarely important.

Licensed under: CC-BY-SA with attribution
