Question

I'm currently modeling a data warehouse.

In one of my fact tables I have 3 measures (numbers that I want to analyse). The problem is that I'll first load one of the numbers, and then load the 2 other numbers a few days later.

Is that a bad thing to do in a DW (because of the "never modify a table" rule)?

The other solution I thought of is to put the first number in one fact table and the 2 others in a second fact table. Both fact tables would of course be linked to the same dimension tables. This solution seems good to me, but it may make comparing the data later a bit more cumbersome.

--

The data in question is working time. First the employee enters his work time (not yet validated) in the DB; this is my first attribute (Qe). Then the boss validates the entry, modifying it or not, which gives me another attribute (Qa). Sometimes both attributes will be loaded into the DW at the same time (if the entry is validated quickly), sometimes not.

So what do you think: which solution is better / cleaner?

Thanks for your help.


Solution

There's no law about modifying a fact table. If it's an accumulating snapshot that is tracking a process as it flows from one step to another, then the standard Kimball method is to update the record as it is modified.

If it's a transactional fact table, where all the measurements for a row are taken at the same time, then updating rows after they have been loaded is bad practice.

In your case, it makes a lot of sense to have an accumulating snapshot to measure this data, since it represents a "workflow", where there's an approval step before the entered results become the "truth".
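As a rough illustration of what that could look like, here is a minimal sketch using Python's built-in sqlite3 module. Everything named here is an assumption made for the example, not something from the question: the table name fact_work_time, the columns qe_hours and qa_hours (standing in for Qe and Qa), the integer dimension keys, and the choice of one row per employee per day. The load first tries to update the existing row and inserts only if there is none, so it handles both the case where Qe and Qa arrive days apart and the case where they arrive together.

```python
import sqlite3


def create_fact_table(conn):
    # Hypothetical accumulating snapshot: one row per employee per day.
    # qa_hours stays NULL until the validation step has happened.
    conn.execute(
        """
        CREATE TABLE fact_work_time (
            employee_key INTEGER NOT NULL,
            date_key     INTEGER NOT NULL,
            qe_hours     REAL,   -- Qe: time entered by the employee
            qa_hours     REAL,   -- Qa: time validated by the boss
            PRIMARY KEY (employee_key, date_key)
        )
        """
    )


def load_work_time(conn, employee_key, date_key, qe_hours=None, qa_hours=None):
    # Update the existing row, keeping any measure that this load does not supply.
    cur = conn.execute(
        """
        UPDATE fact_work_time
           SET qe_hours = COALESCE(?, qe_hours),
               qa_hours = COALESCE(?, qa_hours)
         WHERE employee_key = ? AND date_key = ?
        """,
        (qe_hours, qa_hours, employee_key, date_key),
    )
    # No row yet for this employee and day: create it.
    if cur.rowcount == 0:
        conn.execute(
            "INSERT INTO fact_work_time (employee_key, date_key, qe_hours, qa_hours) "
            "VALUES (?, ?, ?, ?)",
            (employee_key, date_key, qe_hours, qa_hours),
        )


conn = sqlite3.connect(":memory:")
create_fact_table(conn)

# Employee 1: Qe is loaded first, Qa arrives in a later load.
load_work_time(conn, 1, 20240102, qe_hours=8.0)
load_work_time(conn, 1, 20240102, qa_hours=7.5)

# Employee 2: validated quickly, so both measures arrive in the same load.
load_work_time(conn, 2, 20240102, qe_hours=6.0, qa_hours=6.0)

rows = conn.execute(
    "SELECT employee_key, qe_hours, qa_hours FROM fact_work_time ORDER BY employee_key"
).fetchall()
print(rows)
```

Because both measures sit on the same row, comparing entered and validated time is a simple column difference rather than a join between two fact tables, which is the comparison cost the question was worried about.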

Licensed under: CC-BY-SA with attribution