Question

Quick description of the problem: I have created a standard star schema with 6 dimension tables and a single fact table. I need to add either one additional dimension table or an additional column to the fact table. However, unlike the other dimensions in the star schema, the dimension I would like to add will be included in every query I make to the database. I am not sure where it belongs in the design.

Long Description:

I am creating star schemas to represent some very specific Google Analytics queries. In one such schema, I have the following:

Fact: PageTrafficFact

Dimensions:

  • HostnameDim
  • PagePathDim
  • MediumDim
  • DateDim
  • LandingPagePathDim
  • ExitPagePathDim

I need to add either a column to the PageTrafficFact table or an additional dimension to represent the Google Analytics View Profile ID (GAVPID, as I call it) of the corresponding data in the PageTrafficFact table. Whereas all of the other dimensions can be queried interchangeably, 99.9% of the time all queries issued to the database will be specific to a single GAVPID.

While I could make the GAVPID a dimension table, I do not foresee a need to use it as such. The cost of an extra inner join on every single query seems excessive. An alternative I thought of would be to place the GAVPID on the PageTrafficFact table itself. Then, rather than inner-joining on each query, I could use a simpler WHERE clause to select the exact GAVPID I was looking for.

Unfortunately, I do not have the experience to determine which would be best, and searching on Google has been difficult because I am not sure which keywords to use to find an answer.

Any help or recommended resources would be greatly appreciated!


Solution

If there's no need to generate "zero counts" for the new dimension (that is, the data in the fact table isn't sparse in that dimension) and there's no need to "roll up" that dimension, then a separate dimension table isn't strictly necessary.
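To make the "zero counts" point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table GAVPIDDim, the column names, and the sample rows are all hypothetical, invented for illustration. It shows that a profile with no fact rows only appears in the result, with a count of 0, when a dimension table is available to LEFT JOIN from.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: GAVPIDDim and all column names are assumptions.
    CREATE TABLE GAVPIDDim (gavpid_key INTEGER PRIMARY KEY, gavpid TEXT);
    CREATE TABLE PageTrafficFact (gavpid_key INTEGER, pageviews INTEGER);
    INSERT INTO GAVPIDDim VALUES (1, 'profile-A'), (2, 'profile-B');
    -- Only profile-A has traffic; profile-B is "sparse" in the fact table.
    INSERT INTO PageTrafficFact VALUES (1, 10), (1, 5);
""")

# Starting from the dimension table keeps profiles that have no fact rows.
with_dim = conn.execute("""
    SELECT d.gavpid, COALESCE(SUM(f.pageviews), 0)
    FROM GAVPIDDim AS d
    LEFT JOIN PageTrafficFact AS f ON f.gavpid_key = d.gavpid_key
    GROUP BY d.gavpid
    ORDER BY d.gavpid
""").fetchall()

# Grouping the fact table alone cannot report a profile with no rows.
fact_only = conn.execute("""
    SELECT gavpid_key, SUM(pageviews)
    FROM PageTrafficFact
    GROUP BY gavpid_key
""").fetchall()

print(with_dim)   # [('profile-A', 15), ('profile-B', 0)]
print(fact_only)  # [(1, 15)]
```

If a report like the first one is never needed, the dimension table adds little.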

If adding a WHERE clause on an additional column in the fact table satisfies the known and anticipated requirements, I would just add the column to the fact table.
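A minimal sketch of that approach, again with Python's sqlite3 module, might look like the following. The column names, the sample data, the helper pageviews_by_date, and the composite index are assumptions for illustration rather than part of the original schema. Whether an index that leads with gavpid actually helps depends on the database engine and the data volume.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical columns; only the table names come from the question.
    CREATE TABLE DateDim (date_key INTEGER PRIMARY KEY, full_date TEXT);
    CREATE TABLE PageTrafficFact (
        gavpid    TEXT    NOT NULL,   -- stored directly on the fact table
        date_key  INTEGER NOT NULL REFERENCES DateDim(date_key),
        pageviews INTEGER NOT NULL
    );
    -- Assumption: every query filters on gavpid, so the index leads with it.
    CREATE INDEX idx_fact_gavpid_date ON PageTrafficFact (gavpid, date_key);
    INSERT INTO DateDim VALUES (20240101, '2024-01-01'), (20240102, '2024-01-02');
    INSERT INTO PageTrafficFact VALUES
        ('profile-A', 20240101, 10),
        ('profile-A', 20240102, 7),
        ('profile-B', 20240101, 99);
""")

def pageviews_by_date(conn, gavpid):
    """Sum pageviews per date for one profile, filtering with WHERE only."""
    return conn.execute("""
        SELECT d.full_date, SUM(f.pageviews)
        FROM PageTrafficFact AS f
        JOIN DateDim AS d ON d.date_key = f.date_key
        WHERE f.gavpid = ?
        GROUP BY d.full_date
        ORDER BY d.full_date
    """, (gavpid,)).fetchall()

result = pageviews_by_date(conn, "profile-A")
print(result)  # [('2024-01-01', 10), ('2024-01-02', 7)]
```

The other dimensions are still joined as usual; only the profile filter avoids a join.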

License: CC-BY-SA with attribution
Not affiliated with StackOverflow