Question

I'm trying to create some functionality that keeps an audit trail of how data in a given user form has been changed over time, with a dated audit at the foot of that page. For example:

02/04/09 21:49 Name changed from "Tom" to "Chris".

I'm doing this by storing the data in it's present format in the session and then on save checking whether there are any differences in the data being stored. If there are, I'm storing the data how it was before the latest edit in a table called history, and storing the new values in the current user table.

Is this the best approach to be taking?

Was it helpful?

Solution

One suggestion; this would be relatively easy to do in a database trigger. In that case, you would never have to worry about whether the code running the update remembers to add a history record.

OTHER TIPS

I'm not sure there is one "best approach", there are so many variables to take into consideration, including how far down the development path you are.

Having been through both code-based and db-trigger auditing solutions, I've listed some comments below; I hope you can see where you are now at (in terms of development) could affect these issues:

  • If you need to map the user who changed the data (which you normally do) then db triggers will need to get this information somehow. Not impossible, but more work and several ways to approach this (db user executing query, common user column in each table, etc.)
  • If you use db triggers and you rely on affected rows count returned from queries, then your audit triggers need to have this turned off, or your existing code modified to account for them.
  • IMHO db triggers offer more security, and offer an easier path to audit automation, however they are not foolproof, as anyone with appropriate access can disable the triggers, modify data and then enable them again. In other words, ensure your db security access rights are tight.
  • Having a single table for history is not a bad way to go, although you will have more work to do (and data to store) if you are auditing history for multiple tables, especially when it comes to reconstructing the audit trail. You also have to consider locking issues if there are many tables trying to write to one audit table.
  • Having an audit history table for each table is another option. You just need each column in the audit table to be nullable, as well as storing date and time of action (insert/update/delete) and the user associated with the action.
  • If you go with the single table option, unless you have a lot of time to spend on this, don't get too fancy trying to audit only on updates or deletes, although it may be tempting to avoid inserts (since most apps do this more often than updates or deletes), reconstructing the audit history takes a fair bit of work.
  • If your servers or data span multiple time-zones, then consider using an appropriate datetime type to be able to store and reconstruct the timeline, i.e. store audit event date in UTC as well as including the timezone offset.
  • These audit tables can get huge, so have a strategy if they start affecting performance. Options include table partitioning onto different discs, archiving, etc. basically think about this now and not when it becomes a problem :)

I've always been a fan of using one table instead of breaking it up into an "active" table and a "history" table. I put 4 columns on these tables, all timestamps: created, deleted, start, end. "created" and "deleted" are fairly self-explanatory. The "start" and "end" timestamps are for when the record was actually the "active" record. The currently-active record would have a "start" time prior to now() and a NULL "end" time. By separating out the "created" and "start" times, you can schedule changes to take place in the future.

This design, as opposed to the two-table design, allows you to easily write queries that will automatically operate on the right data. Suppose your table is storing the tax rate over time... you don't want to have all your queries that use tax rates in their calculations have the extra complexity of deciding to look stuff up in a history table when processing old invoices, for example... you can just look up the tax rate in effect at the time the invoice was created in one query, regardless of whether it's the current tax rate or not.

This idea is not originally mine (although I did re-invent the rough idea on my own prior to reading about it)... you can find a detailed discussion of it in this online book.

The session involvement makes me a little wary (are you sure you're handling it properly when two users are working on the same data at the same time?), but in general, yeah, keeping a history table is the right thing.

I would also think about a database trigger on insert or update to record change details (who, when, what, value before, value after) to a separate audit table. That way you know that even if the data is changed outide of your app using the database directly, it will still be picked up.

You might also want to do something to detect if the data is changed outide of your app, such as calculate a hash or crc of the record and store it in a field somewhere, then check it when reading the data.

I think your proposal would involve writing a lot of code/metadata to enable comparison of objects/records so you get a business-level audit.

Alternatively, a database trigger may not give you a high-enough level view of what happened. This may be acceptable if you use the audit so infrequently that the effort of recreating the business meaning is ok.

This also seems like a good application for AOP (Aspects), where you could use reflection on the object model to dump something meaningful without requiring a lot of metadata.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top