Question

I realise there might be similar questions but I couldn't find one that was close enough for guidance.

Given this spec,

Site
---------------------------
SiteID      int    identity
Name        varchar(50)

Series
---------------------
SiteID      int
SeriesCode  varchar(6)
...
--SeriesCode will be unique for every unique SiteID

Episode
----------------------
SiteID      int
SeriesCode  varchar(6)
EpisodeCode varchar(10)
...

my proposed design/implementation is

Site
----------------------------
SiteID      int     identity
Name        varchar(50)


Series
-------------------------------------------
SeriesID    int     identity, surrogate key
SiteID      int         natural key
SeriesCode  varchar(6)  natural key
UNIQUE(SiteID, SeriesCode)
...

Episode
-------------------------------------------
EpisodeID   int     identity, surrogate key
SeriesID    int     foreign key
EpisodeCode varchar(6)  natural key
...

Anything wrong with this? Is it okay to have the SeriesID surrogate as a foreign* key here? I'm not sure if I'm missing any obvious problems that can arise. Or would it be better to use composite natural keys (SiteID+SeriesCode / SiteID+EpisodeCode)? In essence that'd decouple the Episode table from the Series table and that doesn't sit right for me.

Worth adding is that SeriesCode looks like 'ABCD-1' and EpisodeCode like 'ABCD-1NMO9' in the raw input data that will populate these tables, so that's another thing that could be changed I suppose.

*: "virtual" foreign key, since it's been previously decided by the higher-ups we should not use actual foreign keys

Was it helpful?

Solution

Yes, it all looks fine. The only (minor) point I might make is that unless you have another 4th child table hanging off of Episode, you probably don't need EpisodeId, as Episode.EpisodeCode is a single attribute natural key sufficient to identify and locate rows in Episode. It's no harm to leave it there, of course, but as a general rule I add surrogate keys to act as targets for FKs in child tables, and try to add a narural key to every table to indentify and control redundant data rows... So if a table has no other table with a FK referencing it, (and never will) I sometimes don't bother including a surrogate key in it.

OTHER TIPS

What's a "virtual" foreign key? Did the higher-ups decide not to use foreign key constraints? In that case, you're not using foreign keys at all. You're just pretending to.

And is Episode the best choice for an entity? Doesn't it really mean Show or Podcast or so, and just happens to always be part of a series right now? If so, will that change in the future? Will Episode eventually be abused to encompass Show outside of a Series? In that case, tying Episode to Site via Series might come back to haunt you.

Given all that, and assuming that you as a grunt probably can't change any of it: if i was you i'd feel safer using natural keys wherever possible. In absence of foreign key constraints, it makes recognizing bad data easier, and if you have to resort to some SeriesCode='EMPTY' trickery later on that's easier with natural keys, too.

My suggestion:

Use natural/business as primary key whenever possible except in the following 3 situations:

  1. The natural/business key is unknown at the moment of inserting
  2. The natural/business key is not good ( it's not unique, it's liable to change frequently )
  3. The natural/business key is a composite of more than 3 columns and the table will have child tables

In situations 1 and 2 a surrogate key is requiered.

In situation 3 a surrogate key is strongly recommended.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top