Question

I want to handle a date dimension in a MySQL datawarehouse. (I m a newbie in the DW world)

I made some searches with google and saw a lot of table structures (most of) date dimension where the Primary Key is a simple UNSIGNED INTEGER.

Why don't use a DATE field as primary key since with MySQL it is 3 Bytes VS 4 Bytes for INTEGER?

Ex:

CREATE TABLE dimDate
id INTEGER UNSIGNED NOT NULL PRIMARY AUTOI_NCREMENT,
date DATE NOT NULL,
dayOfWeek
...

VS

CREATE TABLE dimDate
date DATE NOT NULL PRIMARY,
dayOfWeek
...
Was it helpful?

Solution

If you have a table with a column that is of date type and where no two rows will ever have the same date, then you can surely use this column as PRIMARY KEY.

You see a lot of examples where the Primary Key is a simple UNSIGNED INTEGER because there are many cases where there is no perfect candidate for Primary Key. The AUTO_INCREMENT allows this column to be automatically filled by the database (and be unique) during inserts .

OTHER TIPS

Date dimension is kind of special -- having date (2011-12-07) or date-related integer (20111207) for a primary key is actually preferred. This allows for nice partitioning (by date) of fact tables.

For other type of dimensions, surrogate (integer) keys are recommended.

As a template, each dimension usually has entries for unknown, not entered, error, ... which are often matched with keys 0, -1, -2, ...

Due to this, it is more common to find integer-formatted date (20111207) as a primary key instead of date -- it is a bit messy to represent unknown, not entered, error, ... with date-type key.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top