Вопрос

Structure:

Actor <=== ActorMovie ===> Movie

ActorMovie: ActorID (fk), MovieId (fk)... ===> pk: (ActorID, MovieID)

Should do I create a surrogate key for ActorMovie table like this?

ActorMovie: ActorMovieID (pk), ActorID (fk), MovieId (fk)...
Это было полезно?

Решение

Conventions are good if they are helpful

"SQL Antipatterns", Chapter 4, "ID Required"

Intention of primary key

Primary key - is something that you can use to identify your row with it's unique address in table. That means, not only some surrogate column can be primary key. In fact, primary key should be:

  • Unique. identifier for each row. If it's compound, that means, every combination of column's values must be unique
  • Minimal. That means, it can't be reduced (i.e. if it's compound, no column could be omitted without losing uniqueness)
  • Single. No other primary key may be defined, each table can have only one primary key

Compound versus surrogate

There are cases, when surrogate key has benefits. Most common problem - if you have table with people names. Can combination of first_name + last_name + taxpayer_id be unique? In most cases - yes. But in theory, there could be cases, when duplicated will occur. So, this is the case, when surrogate key will provide unique identifying of rows in any case.

However, if we're talking about many-to-many link between tables, it's obvious, that linking table will always contain each pair once. In fact, you'll even need to check if duplicate does not exist before operating with that table (otherwise - it's redundant row, because it holds no additional information unless your design has a special intention to store that). Therefore, your combination of ActorID + MovieID satisfies all conditions for primary key, and there's no need to create surrogate key. You can do that, but that will have little sense (if not at all), because it will have no meaning rather than numbering rows. In other hand, with compound key, you'll have:

  • Unique check by design. Your rows will be unique, no duplicates for linking table will be allowed. And that has sense: because there's no need to create a link if it already exists
  • No redundant (and, thus, less comprehensive) column in design. That makes your design easier and more readable.

As a conclusion - yes, there are cases, when surrogate key should (or even must) be applied, but in your particular case it will definitely be antipattern - use compound key.

References:

Другие советы

Let me just mention a detail that seems to have been missed by other posters: InnoDB tables are clustered.

If you have just a primary key, your whole table will be represented by a lone B-Tree, which is very efficient. Adding a surrogate would just create another B-Tree (and "fatter" than expected to boot, due to how clustering works), without benefit to offset the added overhead.

Surrogates have their place, but junction tables are usually not it.

I'd always go with the composite key. My reasoning:

  • You will probably never use the surrogate key anywhere.
  • You will reduce the number of indexes/constraints on the table, as you will most certainly still need a indexes over actor and movie.
  • You will always search for either movie or an actor anyway.

Unless you have a scenario where you will actually use the surrogate key outside of it's own table, I'd go with the composite key.

If you want to associate other data elements with the join table, such as the name(s) of the role(s) played (which might be a child table) then I certainly would. If you were sure that you never wanted to then I'd consider it as optional.

Consider the first normal form (1NF) of database design normalization.

I would have made the ActorID and MovieID as unique key combination then create a primary key ActorMovieID.

See the same question here: Two foreign keys instead of primary

On this subject, my point is very simple: surrogate primary keys ALWAYS work, while Composite keys MIGHT NOT ALWAYS work one of these days, and this for multiple reasons.

So when you start asking yourself 'is composite better than surrogate', you have already entered the process of loosing your time. Go for surrogate. It allways works. And switch to next step.

Лицензировано под: CC-BY-SA с атрибуция
Не связан с StackOverflow
scroll top