Question

Everything I am talking about relates to relational databases, specifically MySQL.

I have a number of tables in a database, and for a moderate number of them I want to store a history of each record's values whenever it changes. I have seen this done in a couple of different ways:

  1. One Table/One Field - Basically there is one table that stores the history of all the tables that need history storage. All changes are recorded in one field with a text data type.
  2. Table Per Table/One Field - Same as above, except that each table has its own history table (e.g. Projects/ProjectsHistory, Issues/IssuesHistory, etc.).
  3. Table Per Table/Field Per Field - This is like the above in that each table has its own history table, but the history table also has pretty much the same definition as the regular table, plus additional history-related fields (updateDatetime, updateUserId, etc.).

What are some of the advantages and disadvantages to these different methods of storing record history? Are there other methods that I have not thought of?

Solution

Since you've got many tables with varying numbers of columns, #1 would be out since you'd have a massive table with the aggregate of all your columns, and lots of nulls.

Between #2 and #3, I think you have a decision to make regarding the design complexity you want to manage. My view is that it would be easier to maintain an exact archive replica of a given table and store the whole row state (with the modified time). Think of a case where you update more than one column of a row. In that case #2 would log a separate entry for each changed column, even though they were all part of the same transaction. I'd go with #3 to reduce complexity and to capture the point-in-time row state.
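To make option #3 concrete, here is a minimal sketch of a history table that mirrors its base table and is filled by a trigger. It uses Python's built-in sqlite3 module only so that it runs without a server; MySQL supports BEFORE UPDATE triggers with OLD.column references too, but its DDL differs in places (for example AUTO_INCREMENT and DELIMITER handling). The table, column, and trigger names are assumptions made for illustration, not something taken from the question.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE projects (
    id     INTEGER PRIMARY KEY,
    name   TEXT NOT NULL,
    status TEXT NOT NULL
);

-- Hypothetical history table: same columns as projects plus audit fields.
CREATE TABLE projects_history (
    history_id      INTEGER PRIMARY KEY AUTOINCREMENT,
    id              INTEGER NOT NULL,
    name            TEXT NOT NULL,
    status          TEXT NOT NULL,
    update_datetime TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);

-- Before each update, copy the whole pre-update row into the history table.
CREATE TRIGGER projects_before_update
BEFORE UPDATE ON projects
FOR EACH ROW
BEGIN
    INSERT INTO projects_history (id, name, status)
    VALUES (OLD.id, OLD.name, OLD.status);
END;
""")

conn.execute("INSERT INTO projects (id, name, status) VALUES (1, 'Apollo', 'open')")

# A single UPDATE that touches two columns yields a single history row.
conn.execute("UPDATE projects SET name = 'Apollo 2', status = 'closed' WHERE id = 1")

history_rows = conn.execute(
    "SELECT id, name, status FROM projects_history ORDER BY history_id"
).fetchall()
current_row = conn.execute(
    "SELECT id, name, status FROM projects WHERE id = 1"
).fetchone()

print(history_rows)
print(current_row)
```

Because the trigger copies the entire old row, the two-column update is captured as one consistent snapshot rather than as two unrelated per-column entries.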

OTHER TIPS

You really want to read this first:

http://www.cs.arizona.edu/~rts/tdbbook.pdf

If you want to keep your history in a separate table (and you probably do), you will probably want option #3; it tends to be the easiest to implement and the most convenient. #1 and #2 are pretty ugly and "un-relational".

To some extent it depends on how you intend to use that data. If this is an audit table used strictly to research, now and then, who changed what and when, and occasionally to restore bad changes, then option 2 is fine. If you intend to display the history to users in the application or through reporting, use option 3. I can think of no circumstance where I would use option 1, as the single shared table becomes a place where blocking happens.
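For the reporting case, a field-per-field history table can be queried directly for the state of a row at a given moment. The sketch below is one way that might look, again using Python's sqlite3 module as a stand-in for MySQL. It assumes each history row stores the full row as it stood from update_datetime onward; the issues_history table, its columns, the sample data, and the issue_as_of helper are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical option-3 history table for an issues table.
CREATE TABLE issues_history (
    history_id      INTEGER PRIMARY KEY,
    issue_id        INTEGER NOT NULL,
    title           TEXT NOT NULL,
    status          TEXT NOT NULL,
    update_datetime TEXT NOT NULL,
    update_user_id  INTEGER NOT NULL
);
""")

# Made-up sample versions of issue 7, one row per change.
conn.executemany(
    "INSERT INTO issues_history "
    "(issue_id, title, status, update_datetime, update_user_id) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        (7, "Login fails", "open", "2024-01-01 09:00:00", 1),
        (7, "Login fails", "in progress", "2024-01-03 14:30:00", 2),
        (7, "Login fails on SSO", "closed", "2024-01-10 11:15:00", 2),
    ],
)


def issue_as_of(conn, issue_id, as_of):
    """Return (title, status, update_user_id) as of a timestamp, or None."""
    # The latest version written at or before the requested moment wins.
    return conn.execute(
        """
        SELECT title, status, update_user_id
        FROM issues_history
        WHERE issue_id = ? AND update_datetime <= ?
        ORDER BY update_datetime DESC
        LIMIT 1
        """,
        (issue_id, as_of),
    ).fetchone()


snapshot = issue_as_of(conn, 7, "2024-01-05 00:00:00")
before_creation = issue_as_of(conn, 7, "2023-12-31 00:00:00")
print(snapshot)
print(before_creation)
```

With a single text field per change (options 1 and 2), the same report would first have to parse the stored text, which is the main reason option 3 suits display and reporting better.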

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange