Categorization and grouping of Android and iOS crash reports

https://stackoverflow.com/questions/19800877

04-07-2022

Question

Various mobile applications we support have crash reporting as an added feature which submits more data to us than the normal device-provided method. We support both iOS and Android apps. This information is sent to us and we shove it into a MySQL database.

This was the first step of the design. Now we want the ability to categorize, group, and count these crash reports by stack trace, device type, app version, OS version, and so on.

We currently are using a MySQL database, as mentioned, but there is no reason we could not move to a different database if it provides better support for what we're trying to do. We are moving our system to AWS, so DynamoDB would be the obvious second choice.

So, before I go any further, if you have any suggestions, please answer now.

More details:

We currently have the following data sent to us:

signal (e.g. SIGSEGV)
exception name (e.g. java.lang.NullPointerException or NSInvalidArgumentException)
exception description (e.g. "Unable to instantiate activity..." or "The string argument is NULL")
application name
handheld device type (e.g. samsung/m0/GT-I9300 or iPad)
native stack trace (for Android crashes in native code)
OS version (e.g. 4.1.1 (SDK Level 16) or 6.1.3)
User ID (if available)
Application version
crash timestamp
stack trace
submission date
other irrelevant data

I am able to group Java stack traces together to some degree using GROUP BY, which works surprisingly well... for smaller datasets. But with ~300,000 crash logs, it sort of grinds to a halt.

My first thought is to create a separate table for stack traces, with an indexed SHA hash column holding a hash of the stack trace text. I could then find or create a stack trace row as necessary. I don't know if this will be faster than simply relying on the database server to compare the stack trace strings directly. I could include a counter column to count how often each stack trace occurs, although it may be better to simply count them with SELECT COUNT(*) FROM crash_reports GROUP BY fkStackTraceID, so that I could additionally filter by date or application version.
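A minimal sketch of the hashed stack trace table described above, written in Python. It uses an in-memory SQLite database as a stand-in for MySQL so that it runs on its own. The table name stack_traces, the columns sha, trace, and app_version, and all function names are assumptions made for illustration; only crash_reports and fkStackTraceID come from the question.

```python
import hashlib
import sqlite3


def trace_hash(stack_trace: str) -> str:
    """Return the hex SHA-256 digest of the stack trace text."""
    return hashlib.sha256(stack_trace.encode("utf-8")).hexdigest()


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with the two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE stack_traces (
            id    INTEGER PRIMARY KEY,
            sha   TEXT NOT NULL UNIQUE,   -- UNIQUE also creates the index
            trace TEXT NOT NULL
        );
        CREATE TABLE crash_reports (
            id             INTEGER PRIMARY KEY,
            fkStackTraceID INTEGER NOT NULL REFERENCES stack_traces(id),
            app_version    TEXT NOT NULL
        );
        CREATE INDEX idx_reports_trace ON crash_reports(fkStackTraceID);
        """
    )
    return conn


def find_or_create_trace(conn: sqlite3.Connection, stack_trace: str) -> int:
    """Return the id of the row for this trace, inserting it if missing."""
    sha = trace_hash(stack_trace)
    conn.execute(
        "INSERT OR IGNORE INTO stack_traces (sha, trace) VALUES (?, ?)",
        (sha, stack_trace),
    )
    row = conn.execute(
        "SELECT id FROM stack_traces WHERE sha = ?", (sha,)
    ).fetchone()
    return row[0]


def add_report(conn: sqlite3.Connection, stack_trace: str, app_version: str) -> None:
    """Store one crash report that points at its (shared) stack trace row."""
    trace_id = find_or_create_trace(conn, stack_trace)
    conn.execute(
        "INSERT INTO crash_reports (fkStackTraceID, app_version) VALUES (?, ?)",
        (trace_id, app_version),
    )


def counts_by_trace(conn: sqlite3.Connection, app_version=None):
    """Count reports per stack trace, optionally filtered by app version."""
    sql = "SELECT fkStackTraceID, COUNT(*) FROM crash_reports"
    params = ()
    if app_version is not None:
        sql += " WHERE app_version = ?"
        params = (app_version,)
    sql += " GROUP BY fkStackTraceID ORDER BY COUNT(*) DESC, fkStackTraceID"
    return conn.execute(sql, params).fetchall()
```

Grouping on the integer fkStackTraceID instead of the full trace text keeps the GROUP BY on a small indexed column, while the count stays filterable by version or date. Whether this is actually faster on a given MySQL setup would need measuring.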

Currently, this all falls apart when trying to do the same thing with iOS crash logs, or with native Android crash logs. Each one is distinct, because every stack trace element includes its memory address. I can go to the trouble of finding the offset (which is also included) and subtracting it, which will help.
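One way to make native traces comparable is to normalize each frame before hashing it. The Python sketch below drops the absolute address and keeps only the module, the symbol, and the offset, which gives the same result as subtracting the load address. The frame layout it parses ("index module 0xaddress symbol-or-0xbase + offset") is an assumption modeled on typical iOS crash logs, and the function names are hypothetical; real logs may need a different pattern.

```python
import hashlib
import re

# Assumed frame layout: "<index> <module> <0xaddress> <symbol or 0xbase> + <offset>"
FRAME_RE = re.compile(
    r"^\s*\d+\s+(?P<module>\S+)\s+0x[0-9a-fA-F]+\s+"
    r"(?P<symbol>.+?)\s+\+\s+(?P<offset>\d+)\s*$"
)
HEX_RE = re.compile(r"0x[0-9a-fA-F]+")


def normalize_frame(line: str) -> str:
    """Reduce a frame to 'module symbol+offset', dropping absolute addresses."""
    match = FRAME_RE.match(line)
    if match is None:
        # Unknown layout: only collapse whitespace so nothing is lost.
        return " ".join(line.split())
    symbol = match.group("symbol")
    if HEX_RE.fullmatch(symbol):
        # Unsymbolicated frame: the "symbol" is the module load address,
        # which changes between runs, so replace it with a placeholder.
        symbol = "<base>"
    return f"{match.group('module')} {symbol}+{match.group('offset')}"


def normalized_trace_hash(trace: str) -> str:
    """Hash the trace after normalizing every non-empty frame line."""
    frames = [normalize_frame(line) for line in trace.splitlines() if line.strip()]
    return hashlib.sha256("\n".join(frames).encode("utf-8")).hexdigest()
```

Two crashes at the same code location but with different load addresses then produce the same hash, so they can share one stack trace row.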

So some questions:

Are there any other methods of filtering the data to make it more easily queryable in whatever way you think would be useful for stack traces? I want to get things right the first time, so any additional ways to separate the data for querying that I can't think of now would be good to get in right away.
Is MySQL the best option for this, or would a NoSQL option (e.g. DynamoDB) be more useful?
My previous question again: Are there any prepackaged solutions that do this (or help do this), which function in a manner similar to the crash log sections of the Google Play console and/or the iTunes Connect site?

Solution

Found a recent pre-built solution:

http://www.hockeyapp.net

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow