Question

Apologies if this is not the right SE site to ask.

I'm designing an application for catalogue management. Any element in the hierarchy has to be versioned, i.e. when I change a property of it, a new version is created. The new version can also have different children than the previous. The user can switch between versions when she manages the catalogue, so I'll need to list all the existing versions of a child. When switching version, all the descendant will be updated).

My first thought was something like

| A            |  | B            |  | C            |
|--------------|  |--------------|  |--------------|
| ID  (k)      |  | ID  (k)      |  | ID  (k)      |
| VER (k)      |  | VER (k)      |  | VER (k)      |
| ...          |  | A_ID (fk)    |  | B_ID (fk)    |
|              |  | A_VER (fk)   |  | B_VER (fk)   |
|--------------|  | ...          |  | ...          |
                  |--------------|  |--------------|

Where the entities' key is always something like (ID, VER) and the parent's fk is (P_ID, P_VER).

Is this a good way to implement it?

Update: It's apparently more complicate than that. The meaning of the versioning is that elements have a validity time and we need to keep the older version. E.g. a client buys product with price x. After 1 January it's gonna cost y. New clients after 1 January get new price y, but old client keeps paying x.

Adding a timestamp column is probably the best option here.

Was it helpful?

Solution

To be quite honest your requirements sound like a cross between a nightmare and a Version Control System. I would highly recommend looking at popular solutions like Git instead of reimplementing a DVCS with a relational database. Your application could effectively be a nice skin around Git. If you have to do this via a relational database then I think versioning by time stamps is going to be your best chance of success.

The basic strategy is to put a time stamp on every entity. Then when you need to go to the most recent version you select all the entities through the relational chain with the most recent time stamp. When you need to get a specific version you grab all the most recent versions of the entities that are at or older than the version time stamp you want.

Versioning by time stamp prevents you from copying all the older entities to a new version id even if those entities haven't changed. It also saves you the madness of keeping track of those version ids.

This might lead to slow query times but you can play with it to see if it suits your needs. To reiterate I don't think a relational database is the best solution, but sometimes life forces us to put the square peg in the round hole. If that is your lot in life I wish you the best.

Edit: Versioning is related to historical pricing

I agree that time stamps are the way to go given that versioning is related to historical pricing. To handle that you just need an Effective Start Date column. To find the rate at any point in time you just need to find the most recent date that isn't in the future. For example:

declare @Date DATE = SYSUTCDATETIME(); -- Or any other date for historical/future data
;with [CurrentPrice_CTE] AS -- You don't really need this semi-colon but I added it as a defensive habit.
(
    select
      [PriceId] -- Price table's PK
      max([EffectiveStartDate]) AS [EffectiveStartDate]
    from [dbo].[Price]
    where
      [EffectiveStartDate] <= @Date
)
select
  [p].* -- Select the columns you need
from [dbo].[prices] [p]
    inner join [CurrentPrice_CTE] [cp] as [cp].[PriceId] = [p].[PriceId] and [cp].[EffectiveStartDate] = [p].[EffectiveStartDate]

Just make sure you make the start date unique for each product or that code will return more results than you want. To be totally honest this sounds like a much less complicated problem than having to reversion the entire database every update. :)

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange
scroll top