Question

I'm hesitating between several alternatives when it comes to relations that have "historical" value.

For example, let's say a User has bought an item on a certain date. If I just store this the classic way, like:

transaction_id: 1
user_id: 2
item_id: 3
created_at: 01/02/2010

Then obviously the user might change their name, the item might change its price, and 3 years later, when I try to create a report of what happened, I get incorrect data.

I see two alternatives:

  1. Keep it simple, as shown above, but use something like https://github.com/airblade/paper_trail and do something like:

    t = Transaction.find(1)
    u = t.user.version_at(t.created_at)
    
  2. Create tables like transaction_users and transaction_items, and copy the users/items into these tables when a transaction is made. The structure would then become:

    transaction_id: 1
    transaction_user_id: 2
    transaction_item_id: 3
    created_at: 01/02/2010
    

Both approaches have their merits, though solution 1 looks much simpler. Do you see a problem with solution 1? How is this "historical data" problem usually solved? I have to solve it for 2-3 models like this in my project; which solution would you recommend?


Solution 2

I went with PaperTrail; it keeps the history of all my models, even after they are destroyed. I can always switch to option 2 later on if it doesn't scale.

OTHER TIPS

Taking the example of Item price, you could also:

  1. Store a copy of the price at the time in the transaction table
  2. Create a temporal table for item prices

Storing a copy of the price in the transaction table:

TABLE Transaction(
 user_id      -- User buying the item
,trans_date   -- Date of transaction
,item_no      -- The item
,item_price   -- A copy of Price from the Item table as-of trans_date
)

Getting the price as of the time of transaction is then simply:

select item_price
  from transaction;
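
As a minimal sketch of the copy-at-purchase idea, the following uses Python's built-in sqlite3 module with an in-memory database. The column types, the sample values, and the `record_purchase` helper are assumptions made for illustration, and the table is called `purchase` here only to sidestep the SQL keyword TRANSACTION.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (item_no TEXT PRIMARY KEY, price INTEGER NOT NULL);
    CREATE TABLE purchase (
        user_id    INTEGER NOT NULL,
        trans_date TEXT    NOT NULL,
        item_no    TEXT    NOT NULL REFERENCES item(item_no),
        item_price INTEGER NOT NULL  -- snapshot of item.price at purchase time
    );
    INSERT INTO item VALUES ('A', 100);
""")


def record_purchase(conn, user_id, trans_date, item_no):
    """Hypothetical helper: insert a purchase, copying the item's current price."""
    conn.execute(
        """INSERT INTO purchase (user_id, trans_date, item_no, item_price)
           SELECT ?, ?, item_no, price FROM item WHERE item_no = ?""",
        (user_id, trans_date, item_no),
    )


record_purchase(conn, 2, "2010-02-01", "A")
conn.execute("UPDATE item SET price = 90 WHERE item_no = 'A'")  # later price change
record_purchase(conn, 2, "2011-03-01", "A")

# Each purchase keeps the price that was valid when it was made.
snapshot_prices = [row[0] for row in conn.execute(
    "SELECT item_price FROM purchase ORDER BY trans_date")]
print(snapshot_prices)
```

The later price change does not touch the first purchase, because the report reads the copied column rather than the live `item.price`.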

Creating a temporal table for item prices:

TABLE item (
   item_no
  ,etcetera -- All other information about the item, such as name, color
  ,PRIMARY KEY(item_no)
)

TABLE item_price(
   item_no
  ,from_date
  ,price
  ,PRIMARY KEY(item_no, from_date)
  ,FOREIGN KEY(item_no)
      REFERENCES item(item_no)
)

The data in the second table would look something like:

ITEM_NO  FROM_DATE   PRICE
=======  ==========  =====
   A     2010-01-01  100
   A     2011-01-01  90
   A     2012-01-01  50
   B     2013-03-01  60

This says that from 1 January 2010 the price of item A was 100. It changed to 90 on 1 January 2011, and then to 50 on 1 January 2012.

You will most likely add a TO_DATE column to the table, even though it is a denormalization (each row's TO_DATE is simply the next FROM_DATE for the same item).
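
To illustrate why TO_DATE is redundant, here is a rough sketch (Python's built-in sqlite3, in-memory) that derives it from the sample rows above. The column types and the far-future sentinel `'9999-12-31'` for the current price are assumptions, not part of the original design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item_price (
        item_no   TEXT    NOT NULL,
        from_date TEXT    NOT NULL,  -- ISO-8601 text, so string order = date order
        price     INTEGER NOT NULL,
        PRIMARY KEY (item_no, from_date)
    );
    INSERT INTO item_price VALUES
        ('A', '2010-01-01', 100),
        ('A', '2011-01-01', 90),
        ('A', '2012-01-01', 50),
        ('B', '2013-03-01', 60);
""")

# to_date = the next from_date for the same item. The latest price has no
# successor, so an assumed far-future sentinel closes its interval.
rows = conn.execute("""
    SELECT p.item_no,
           p.from_date,
           COALESCE((SELECT MIN(n.from_date)
                       FROM item_price n
                      WHERE n.item_no = p.item_no
                        AND n.from_date > p.from_date),
                    '9999-12-31') AS to_date,
           p.price
      FROM item_price p
     ORDER BY p.item_no, p.from_date
""").fetchall()

for row in rows:
    print(row)
```

Because each derived to_date equals the following from_date, a lookup against these intervals should treat them as half-open (from_date <= d < to_date); otherwise a date that falls exactly on a price change would match two rows.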

Finding the price as of the transaction would be something along the lines of:

select t.item_no
      ,t.trans_date
      ,p.price
  from transaction t
  join item_price  p on(
       t.item_no = p.item_no
   -- half-open interval, because to_date is the next from_date
   and t.trans_date >= p.from_date
   and t.trans_date <  p.to_date
  );


ITEM_NO TRANS_DATE PRICE
======= ========== =====
   A    2010-12-31  100
   A    2011-01-01   90
   A    2011-05-01   90
   A    2012-01-01   50
   A    2012-05-01   50
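
For comparison, the same as-of lookup can be sketched without a TO_DATE column, by picking the latest FROM_DATE that is not after the transaction date. The example below uses Python's built-in sqlite3 with the sample data above; the `purchase` table name (chosen to avoid the keyword TRANSACTION) and the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item_price (item_no TEXT, from_date TEXT, price INTEGER,
                             PRIMARY KEY (item_no, from_date));
    CREATE TABLE purchase (item_no TEXT, trans_date TEXT);
    INSERT INTO item_price VALUES
        ('A', '2010-01-01', 100), ('A', '2011-01-01', 90),
        ('A', '2012-01-01', 50),  ('B', '2013-03-01', 60);
    INSERT INTO purchase VALUES
        ('A', '2010-12-31'), ('A', '2011-01-01'), ('A', '2011-05-01'),
        ('A', '2012-01-01'), ('A', '2012-05-01');
""")

# For each purchase, join to the price row whose from_date is the most
# recent one on or before the purchase date.
rows = conn.execute("""
    SELECT t.item_no, t.trans_date, p.price
      FROM purchase t
      JOIN item_price p
        ON p.item_no = t.item_no
       AND p.from_date = (SELECT MAX(q.from_date)
                            FROM item_price q
                           WHERE q.item_no = t.item_no
                             AND q.from_date <= t.trans_date)
     ORDER BY t.trans_date
""").fetchall()

for row in rows:
    print(row)
```

This variant avoids the denormalized column at the cost of a correlated subquery, which may be slower on large tables than a range join on stored FROM_DATE and TO_DATE values.
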
Licensed under: CC-BY-SA with attribution