Question

I have items (tracked with itemid) that have attributes assigned to them (field1, field2, for example). At any given time, field1=val1, field2=val2, ..., fieldn=valn, where val1, val2, ..., valn are in 1..100 and n = 1..20.

I need to design a DB to track the changes for each field, and I'm not sure if I should list each field as a column (for example table1), or list each change as a row (for example table2).

create table table1 (datetimeofchange datetime2, itemid bigint, field1 smallint, field2 smallint, field3 smallint, field4 smallint, ..., fieldn smallint)

create table table2 (datetimeofchange datetime2, itemid bigint, fieldval smallint, changeval smallint)

create table table2lookup (fieldval smallint, fieldname varchar(50))
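To make the comparison concrete, here is a minimal sketch of both designs using SQLite (via Python's `sqlite3`), with hypothetical values, n = 2 fields, and ISO-8601 text timestamps standing in for `datetime2`. The second design, with one row per changed attribute plus a name-lookup table, is commonly known as an entity-attribute-value (EAV) layout.

```python
import sqlite3

# In-memory database illustrating both candidate designs with n = 2 fields.
con = sqlite3.connect(":memory:")
cur = con.cursor()

# Design 1: one column per field (wide audit table).
cur.execute("""CREATE TABLE table1 (
    datetimeofchange TEXT, itemid INTEGER,
    field1 INTEGER, field2 INTEGER)""")

# Design 2: one row per changed field (EAV-style), plus a lookup table.
cur.execute("""CREATE TABLE table2 (
    datetimeofchange TEXT, itemid INTEGER,
    fieldval INTEGER, changeval INTEGER)""")
cur.execute("CREATE TABLE table2lookup (fieldval INTEGER, fieldname TEXT)")
cur.executemany("INSERT INTO table2lookup VALUES (?, ?)",
                [(1, "field1"), (2, "field2")])

# One change event touching both fields of item 42:
cur.execute("INSERT INTO table1 VALUES ('2024-01-01T00:00:00', 42, 10, 20)")
# ...is one row in table1, but two rows in table2:
cur.executemany("INSERT INTO table2 VALUES (?, ?, ?, ?)",
                [("2024-01-01T00:00:00", 42, 1, 10),
                 ("2024-01-01T00:00:00", 42, 2, 20)])

print(cur.execute("SELECT COUNT(*) FROM table1").fetchone()[0])  # 1
print(cur.execute("SELECT COUNT(*) FROM table2").fetchone()[0])  # 2
```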

What are the trade-offs of choosing between the two designs? Is there another design with different trade-offs? Are there terms I can google to research this issue?

Thanks.


Solution

I'm no expert, but I see a couple of potential pros and cons.

So if you use table1 and insert a row after every data change, what happens? If you have a lot of small updates, each changing only one field at a time, you will store a lot of redundant data from the unchanged fields. If instead you record only the fields that changed and leave the rest NULL, it can be tricky to reconstruct what the data looked like at a given time, but it's not a bad solution. Table1 is ideal for changes that touch multiple columns at once.

Table2, on the other hand, is the opposite. Say you update 10 fields for 1,000 rows: in table2 you get 10,000 rows, and each change event is spread across 10 rows. If you mostly update only one field at a time, it will be quite comparable to table1. Table2 works better with updates that affect only one, or at most a few, columns at a time.
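The row counts in that scenario follow from simple arithmetic (the numbers below are the illustrative ones from the paragraph above, not a measurement):

```python
items = 1_000        # items whose fields change in one batch
fields_changed = 10  # fields updated per item

rows_table1 = items                   # one wide row per change event
rows_table2 = items * fields_changed  # one narrow row per changed field

print(rows_table1, rows_table2)  # 1000 10000
```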

Personally, I'd use the table1 approach. It's simpler and more straightforward, and an update to multiple columns is recorded as a single row. If you record a field only when it changes, you eliminate the redundant data, although the scripts to figure out what the data changes were may be more complex. On the other hand, if you only have maybe a few hundred thousand rows, storing redundant data might not be so bad; if you have a hundred million rows, you might have some issues.
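The "more complex script" mentioned above can be sketched as follows: when table1 stores NULL for unchanged fields, an item's current state is the most recent non-NULL value of each field. This is a sketch against a hypothetical two-field version of table1 in SQLite, not production code:

```python
import sqlite3

# Sparse table1: NULL means "this field did not change in this event".
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("""CREATE TABLE table1 (
    datetimeofchange TEXT, itemid INTEGER,
    field1 INTEGER, field2 INTEGER)""")
cur.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?)", [
    ("2024-01-01", 42, 10, 20),    # both fields set initially
    ("2024-01-02", 42, None, 25),  # later change touched only field2
])

# Reconstruct item 42's current state: latest non-NULL value per field.
state = {}
for field in ("field1", "field2"):
    row = cur.execute(
        f"""SELECT {field} FROM table1
            WHERE itemid = ? AND {field} IS NOT NULL
            ORDER BY datetimeofchange DESC LIMIT 1""", (42,)).fetchone()
    state[field] = row[0] if row else None

print(state)  # {'field1': 10, 'field2': 25}
```

Note that this needs one pass (or one subquery) per field, which is exactly where the extra complexity of the sparse wide-table design shows up.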

Hope this helps at least a little bit.

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange