SCD Type 2 Dimension -> Is this the correct layout for a type 2 scd with this type of data present?

https://dba.stackexchange.com/questions/17272

22-10-2019
|

Question

Here is an example picture of my layout.

enter image description here

As you can see I have my SCD types present (status/startdate/enddate/businesskey).

My key is a surrogate key that identifies each record.

My problem is that my hierarchy seems to error out and it may be due to an incorrect layout. Here is my current key column structure.

Hierarchy

LineOfBusinessId (key column: LineOfBusinessId) (name column: LineOfBusinessName)

WorkerDivisionId (key column: LineOfBusinessId, WorkerDivisionId) (name column: WorkerDivisionName)

WorkerId (key column: LineOfBusinessId, WorkerDivisionId, WorkerId) (name column: WorkerName)

This dimension errors out because a workername changes and when the select distinct occurs during full processing it finds duplicates. I am just wondering what the best way of setting up the key columns is for this specific situation. Do the names need to be part of the key columns or is my hierarchy completely incorrect. I was sure I had this set up correctly but I am obviously questioning myself now.

I have received some advice that the best thing to do in this situation is to use the surrogate key as part of my key columns, but then found out that this will cause issues as I only want one record to show up when querying my SCD. Later, I received some more advice saying that I shouldn't use the surrogate in the id key columns but in the name key columns, specifically at the Worker level.

I am rather skiddish of updating all of my cubes with this setup without hearing from someone who has made a hierarchy of this type. When I look at this setup, I have to think that others have made hierarchies of this type as it is rather simple. Any advice/direction on this is greatly appreciated.

Thanks.

Solution

Making SSAS hierarchies on slowly changing dimensions is a bit of a fiddle. You need to make surrogate keys for each historical version at each level of the hierarchy. Then the key has the actual business facing name, which the user selects or reports by.

As an example, imagine worker BloggsJ in Division1, which is in LineOfBusiness1. Now Division1 gets moved to LineOfBusiness2. Logically you have the Division entity with two rows now:

DivisionKey     Division     LineOfBusinessKey
          1     Division1                   11 
          2     Division2                   12
          3     Division1                   12

and

LineOfBusinessKey    LineOfBusiness
               11    LineOfBusiness1
               12    LineOfBusiness2

Now, we have worker BloggsJ, who is assigned to Division 1, which is subsequently moved

WorkerKey  WorkerName   DivisionKey   Division   LineOfBusinessKey   LineOfBusiness
      101     BloggsJ             1   Division1                 11  LineOfBusiness1
      102     BloggsJ             3   Division1                 12  LineOfBusiness2
      103     SmithF              2   Division2                 12  LineOfBusiness2

In this case the keys remain in a strictly hierachical order:

LineOfBusiness2 (Key=12) has two children: Division2 (Key=2) and Division1 (Key=3). Division1 (Key=3) has one child: BloggsJ (Key=102) and Division2 (Key=2) has one child: SmithF (Key=103).
LineOfBusiness1 (Key=11) has one child: Division1 (Key=1), which has one Child: BloggsJ (Key=101)

Displaying the name in the cube allows you to build a hierarchy that can support drill-down operations. You will also probably want to hide the base attributes for this hierarchy and display another set with just the names, so there is something in the dimension that will produce a clean, unique list of members at each level without opaque, confusing repeated names.

*****Update 05012012 3:22 PM CST

Here is an image of my data example.

enter image description here

Licensed under: CC-BY-SA with attribution

Not affiliated with dba.stackexchange