Question

Let's say we have a dimension that represents sales offices. The offices might move, which would be a type II change. We'd want to track operations that happened at the old office location, and operations that now happen at the new, and know when the change happened. So far, just standard type II design. Now let's say an office merges with another office. That is, the operational activity of two originally distinct offices (the "parent offices") is now taking place in a single office (the "merged office"), which might be a continuation (physically or in terms of staff) of either one of the original offices, or it might be a new office altogether that is, from a business point of view, a continuation of the previous two.

The reporting/analysis requirements are as follows:

  • We want to be able to see all the current activity for the new merged office.
  • We want to be able to see all the activity that's ever been done by the merged office or by the parent offices.
  • We want to be able to see over time activity that took place in one of the parent offices both before and after merger, without seeing activity from the other parent office (at least before the merger).

I'm not sure how to model this with any of the SCD types. If we simply replace the two parent office entries with a single new one, and update all the fact tables accordingly, we have a type I change. That allows us to see current activity just fine, but we lose history. If we keep the records separate, we won't know about the merger. If we add a third record to represent the merged office, we also lose history (which natural key would it have? neither of the parent offices' natural keys would be suitable).

Do I need to use a bridge/many-to-many table? That introduces complexity that I'd like to avoid. However, if that's the best way to do it, then so be it. I'm still not sure, however, how that would be structured. Perhaps the fact table would point to an office entry, and the offices would be grouped in a many-to-many fashion. Reporting would be done based on groups, rather than directly on the office dimension.

Answers to ElectricLlama's Questions

  • Most user interaction is via canned reports, so any complexity from the underlying structure will be hidden from them.
  • Some users do use SQL or SAS to get at the data. Right now, they are unlikely to care about this specific problem, but that may change as we bring more users on board with these tools.
  • We have no query writer at this juncture.
  • I don't think there will be multi-level mergers, but I can't definitively say no. I'd be surprised, though, if there were.
  • I don't know how to make this kind of thing easy for the end-user, which might be argument enough to relax some requirements.
Was it helpful?

Solution

I would prefer the simplest possible solution that the customer could accept, so I would do the following. I would provide two office field in the office dimension:

  1. Office_as_today
  2. Office_original

(of course you have to pick names that are good for your customer) At the start the two fields would be set equal. When two office merges I would go back in the two original offices and update the Office_as_today field, with the name of the merged office.

The new facts (from the merge on) will be registered with a new row with the two fields again equal.

The solution is very simple and fulfill almost all the requirement, with the exception of being unable to follow the original offices after the merge (here I underline your "at least").

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top