Question

We have a huge traffic database (call data records). Because of the number of records and hardware limitations, we can't use the source detail records for reporting. So we summarize the records in a relational database using 15-minute intervals and then truncate the source data immediately. These summarized records (events over a time period) are stored in the warehouse and then used for the fact table (MOLAP). Time spans/periods are stored in the SYS_TIME_SLICES dimension table, a simplified version of which is:

CREATE TABLE [dbo].[SYS_TIME_SLICES] (
  [ID] [int] IDENTITY(1, 1) NOT NULL,
  [DATETIME_START] [datetime] NOT NULL,
  [DATETIME_END] [datetime] NOT NULL
) ON [PRIMARY]

The first records are:

DATETIME_START      DATETIME_END
01-Jan-13 00:00:00  01-Jan-13 00:15:00
01-Jan-13 00:15:00  01-Jan-13 00:30:00

Now we are putting this into the cube, and I am not sure how to do it according to best practices. I have checked the “Step-by-Step” book and some internet tutorials on the time dimension and related BI, but still have no clue. Marking the TIME_SLICES dimension as a time dimension produced some weird results. Marking DATETIME_END as the DateEnded type produced even stranger results.

I am relatively new to SSAS but have 15 years of experience dealing with SQL and customer reports, so I do understand what I want and how to do it in regular SQL.

We have to provide reports with Hour, Day, Week and Month granularity. This can easily be done by manually configuring the attributes of the TIME_SLICES dimension and adding a hierarchy (without any special SSAS magic). But I would like to have all the bells and whistles of time intelligence (Day Over Day Growth and so on).

More importantly, old data (6+ months) is never updated, and its source warehouse table is archived. In the cube we need old data at the By Day level of detail only, which saves space on the server. This can presumably be achieved with partitions, but I am not sure about the time-specific details.

Given the considerations above, is there a recommended or common approach for this time span/period dimension? Any hints? Any books on the subject? Should we change our warehouse logic?

This is related to "SSAS - Facts that happened over a time range", but it is a slightly different question. We are using 2008 R2 but can upgrade to 2012 if needed.


Solution

In most cases, it is better to use two separate dimensions for date and time. The time dimension in your case would have just 96 (24 * 4) members. These could be aggregated to hours and, depending on reporting/user needs, to other time ranges such as shift names. I would use an integer key for this, maybe a "speaking key" like 1415 for the time span from 2:15 pm to 2:30 pm (i.e. hour of start * 100 + minutes of start). This would make it easy to calculate the foreign key in the fact table from the DATETIME_START column. But you can also assign meaningless integer surrogate keys to each record.
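As a rough illustration of the "speaking key" idea, the following T-SQL sketch fills a time-of-day dimension with the 96 quarter-hour slots and derives the matching key from DATETIME_START. The table name DIM_TIME_OF_DAY and its column names are hypothetical (they do not appear in the question), and the recursive CTE is just one of several ways to generate the rows.

```sql
-- Hypothetical time-of-day dimension; table and column names are assumptions.
CREATE TABLE dbo.DIM_TIME_OF_DAY (
  TIME_KEY     int     NOT NULL PRIMARY KEY,  -- hour * 100 + minute, e.g. 1415
  HOUR_OF_DAY  tinyint NOT NULL,              -- 0..23, for the Hour level
  MINUTE_START tinyint NOT NULL               -- 0, 15, 30 or 45
);

-- Generate slot numbers 0..95 (24 hours * 4 quarter-hours) and insert them.
WITH slots AS (
  SELECT 0 AS n
  UNION ALL
  SELECT n + 1 FROM slots WHERE n < 95
)
INSERT INTO dbo.DIM_TIME_OF_DAY (TIME_KEY, HOUR_OF_DAY, MINUTE_START)
SELECT (n / 4) * 100 + (n % 4) * 15,  -- speaking key
       n / 4,                         -- integer division gives the hour
       (n % 4) * 15                   -- minute at which the slot starts
FROM slots;

-- Derive the foreign key for each existing slice from its start time.
SELECT ID,
       DATEPART(hour, DATETIME_START) * 100
         + DATEPART(minute, DATETIME_START) AS TIME_KEY
FROM dbo.SYS_TIME_SLICES;
```

For a slice starting at 14:15 the last query yields 1415, which matches the row generated for slot 57 in the dimension.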

The date dimension would have days as its granularity and could have attributes like day of week, holiday yes/no, month, quarter, year, ... Which ones you need depends very much on the reporting requirements. It is better to prepare too many attributes than too few, as this makes reporting easier. And the dimension table has few records, so there is no need to restrict the number of attributes too much. You can use a similar structure for the integer primary key of this table, e.g. 20131009 for October 9, 2013. But again, an arbitrary integer surrogate key would do as well.
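A minimal sketch of how the yyyymmdd key and a few of these attributes could be derived from the existing slices, assuming the SYS_TIME_SLICES table from the question. The output column names are hypothetical, and a holiday flag would have to come from a separate calendar source since it cannot be computed from the date alone.

```sql
-- One row per calendar day that occurs in the slice table.
-- CONVERT style 112 formats a datetime as 'yyyymmdd'.
SELECT DISTINCT
       CONVERT(int, CONVERT(char(8), DATETIME_START, 112)) AS DATE_KEY,    -- e.g. 20131009
       DATENAME(weekday, DATETIME_START)                   AS DAY_OF_WEEK, -- depends on session language
       DATEPART(month,   DATETIME_START)                   AS MONTH_NO,
       DATEPART(quarter, DATETIME_START)                   AS QUARTER_NO,
       DATEPART(year,    DATETIME_START)                   AS YEAR_NO
FROM dbo.SYS_TIME_SLICES;
```

The same CONVERT expression can be used when loading the fact table, so that each fact row carries a DATE_KEY next to its time-of-day key.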

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow