Question

This is a really basic question, but I can't find a solid answer to it. Can I have values in my Dimension table which are not in the Fact table? I realize that the opposite direction is a problem: I cannot have a Dimension Key in my Fact Table which does not exist in my Dimension Table. But what about the other way around?

I have a Customer Table which contains all of my customers. I then have an Orders Fact Table which includes Customer IDs; however, not all Customers have ever ordered something, so the Orders Fact does not contain a Customer ID for every Customer in the Customer Table.

This seems like a reasonable situation, but I have run into Key Not Found errors when processing my cubes, and nothing appears to resolve them other than using a Named Query for my Dimension where I specifically filter out any Customers that do not have any Orders. This resolves the error, but I'd rather not do this if I don't have to. Maybe there really is another underlying issue behind my Key Not Found errors.

So, I was hoping someone could definitively tell me whether my scenario should work. Can I have more records in my Dimension Table than are in my Fact Table? If so, then I will spend more time trying to figure out the error. If not, I will resign myself to creating multiple views of my Customer table for each Fact that I need to use it with.

Thank you


Solution

I would say yes. There is little 'harm' in the setup; at most you are storing a few more bytes than you might need to. In this case, having customers who have yet to order anything in a dimension table is not going to harm anything, and it is quite possibly a necessary step as a customer goes from 'created, yet to order' to 'created and ordered'.

Usually the Dimension table tends to be derived from the fact table, so it is a bit confusing how a dimension derived that way could end up with values that aren't in the fact table... but I can see how it happens in your setup.

I often find 'archived' dimension values can hang around long after all the fact records referring to them have been repointed.

Seems like a no harm no foul situation to me...
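To make the two directions concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (dim_customer, fact_orders, customer_id) and the sample rows are made up for illustration and are not taken from the asker's schema. It lists dimension rows that have no facts (harmless) and fact keys that have no dimension row, which is the direction that would normally be expected to produce a key-not-found style error.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE fact_orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        amount REAL
    );
    INSERT INTO dim_customer VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
    INSERT INTO fact_orders VALUES (100, 1, 25.0), (101, 1, 10.0), (102, 2, 40.0);
""")

# Direction 1: customers in the dimension that never appear in the fact table.
# These are fine; they simply have no orders yet.
customers_without_orders = conn.execute("""
    SELECT d.customer_id
    FROM dim_customer AS d
    LEFT JOIN fact_orders AS f ON f.customer_id = d.customer_id
    WHERE f.customer_id IS NULL
    ORDER BY d.customer_id
""").fetchall()

# Direction 2: fact keys with no matching dimension row.
# Any rows returned here are the ones worth investigating.
orphan_fact_keys = conn.execute("""
    SELECT DISTINCT f.customer_id
    FROM fact_orders AS f
    LEFT JOIN dim_customer AS d ON d.customer_id = f.customer_id
    WHERE d.customer_id IS NULL
    ORDER BY f.customer_id
""").fetchall()

print("customers without orders:", customers_without_orders)
print("orphan fact keys:", orphan_fact_keys)
```

If the second query returns rows against the real tables, that would point to missing dimension members rather than to the extra customers. Other causes (NULL keys, or type and collation mismatches between the two key columns) are possible too and are not covered by this sketch.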

OTHER TIPS

Yes, of course.

Generally speaking you want "conformed dimensions", i.e. dimensions you can share across fact tables.

Say your customer_orders_product fact table uses the calendar dimension from Jan 1, 2010 to Dec 31, 2013.

But now you add a new fact table, warehouse_receives_shipment, and that data goes back to 2005.

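You wouldn't want two calendar dimension tables. One calendar dimension covering 2005 through 2013 serves both fact tables, even though the orders fact never references the earlier dates.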
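A rough sketch of that shared calendar, again with Python's sqlite3 module. The fact table names come from the example above; dim_calendar, its columns, and the sample fact rows are assumptions made for illustration.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One calendar dimension (hypothetical layout) shared by both fact tables.
    CREATE TABLE dim_calendar (date_key TEXT PRIMARY KEY, year INTEGER);
    CREATE TABLE customer_orders_product (
        date_key TEXT REFERENCES dim_calendar(date_key),
        quantity INTEGER
    );
    CREATE TABLE warehouse_receives_shipment (
        date_key TEXT REFERENCES dim_calendar(date_key),
        units INTEGER
    );
""")

# Populate the calendar for the widest range any fact table needs.
start, end = date(2005, 1, 1), date(2013, 12, 31)
days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
conn.executemany(
    "INSERT INTO dim_calendar VALUES (?, ?)",
    [(d.isoformat(), d.year) for d in days],
)

# Made-up sample facts: orders start in 2010, shipments go back to 2005.
conn.execute("INSERT INTO customer_orders_product VALUES ('2010-01-01', 3)")
conn.execute("INSERT INTO warehouse_receives_shipment VALUES ('2005-06-15', 500)")

# Both fact tables join to the same dimension.
order_years = [row[0] for row in conn.execute("""
    SELECT c.year
    FROM customer_orders_product AS f
    JOIN dim_calendar AS c ON c.date_key = f.date_key
""")]
shipment_years = [row[0] for row in conn.execute("""
    SELECT c.year
    FROM warehouse_receives_shipment AS f
    JOIN dim_calendar AS c ON c.date_key = f.date_key
""")]

print("order years:", order_years)
print("shipment years:", shipment_years)
```

Most of the calendar rows are unused by the orders fact, and nothing breaks because of it. That is the same situation as customers who have never ordered.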

Licensed under: CC-BY-SA with attribution