Question

I'll start with an example. Suppose I have a person table with an intended surrogate primary key, Id:

+----+------+------------+-------------+
| Id | name |    DoB     |     SSN     |
+----+------+------------+-------------+
|  1 | John | 1901-01-01 | 111-11-1111 |
|  2 | Jane | 1902-02-02 | 222-22-2222 |
|  3 | John | 1901-01-01 | 111-11-1111 |
+----+------+------------+-------------+

Note that Ids 1 and 3 have the same attributes; they both represent the same person.

Now from what we know about the theory behind what constitutes a primary key, which I think is well summarized here:

  • The primary key must uniquely identify each record.
  • A record’s primary-key value can’t be null.
  • The primary-key value must exist when the record is created.
  • The primary key must remain stable—you can’t change the primary-key field(s).
  • The primary key must be compact and contain the fewest possible attributes.

Consider the first bullet, "The primary key must uniquely identify each record." In my example, I suppose whether or not each Id does represent uniqueness depends on what's really supposed to be considered unique. A different database record? Yes. A different person (what the records are supposed to represent)? No.

So multiple Ids represent what is functionally the same subject generating the data, present in two records. A "two-to-one Id" of sorts. I've not read anything that directly addresses the scenario my example illustrates, as it relates to what is or is not a PK.

  1. Does this example violate the theory behind what constitutes a primary key?
    1. If not, does this example illustrate a violation of any larger principles of database architecture, or can this concept be reduced to something as simple as "duplication of data - clean it up"?

Many thanks.


Solution

The problem with this model is that you did not identify all candidate keys in the relation and did not enforce uniqueness of those keys you did not identify. In reality {SSN} and, possibly, {Name, DoB, SomethingElse} would constitute additional candidate keys.
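
As a minimal sketch of what enforcing that extra candidate key could look like, the snippet below uses Python's built-in sqlite3 module with an in-memory database. The choice of SQLite, the column types, and the helper names build_person_table and insert_person are assumptions made for illustration; only the table and column names come from the question. It also assumes SSN alone is a valid candidate key.

```python
import sqlite3


def build_person_table(conn):
    """Create the person table with SSN declared as an enforced candidate key."""
    conn.execute(
        """
        CREATE TABLE person (
            Id   INTEGER PRIMARY KEY,   -- surrogate key
            name TEXT NOT NULL,
            DoB  TEXT NOT NULL,
            SSN  TEXT NOT NULL UNIQUE   -- assumed natural candidate key
        )
        """
    )


def insert_person(conn, name, dob, ssn):
    """Insert one row; return the new Id, or None if a key constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            cur = conn.execute(
                "INSERT INTO person (name, DoB, SSN) VALUES (?, ?, ?)",
                (name, dob, ssn),
            )
        return cur.lastrowid
    except sqlite3.IntegrityError:
        return None


conn = sqlite3.connect(":memory:")
build_person_table(conn)

# The three rows from the question; the third repeats John's SSN.
results = [
    insert_person(conn, "John", "1901-01-01", "111-11-1111"),
    insert_person(conn, "Jane", "1902-02-02", "222-22-2222"),
    insert_person(conn, "John", "1901-01-01", "111-11-1111"),
]
row_count = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
print(results, row_count)  # [1, 2, None] 2
```

With the UNIQUE constraint in place, the second "John" row is rejected at insert time, so each Id ends up corresponding to exactly one SSN.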

Nevertheless, Id is still the primary key of this relation, but it does not identify the entity that you expect(ed) it to. It does not identify a "person", but something else, e.g. "an occurrence of somebody entering a person's data".
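
To see this in the existing data, one option is to group the rows by the key that was supposed to be unique. The sketch below again assumes SQLite via Python's sqlite3 module and assumes SSN is the real-world identifier of a person; the variable name duplicates is illustrative. It loads the three rows from the question without any uniqueness constraint and lists each SSN that is shared by more than one Id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The table as modeled in the question: only Id is constrained.
conn.execute(
    "CREATE TABLE person (Id INTEGER PRIMARY KEY, name TEXT, DoB TEXT, SSN TEXT)"
)
conn.executemany(
    "INSERT INTO person (Id, name, DoB, SSN) VALUES (?, ?, ?, ?)",
    [
        (1, "John", "1901-01-01", "111-11-1111"),
        (2, "Jane", "1902-02-02", "222-22-2222"),
        (3, "John", "1901-01-01", "111-11-1111"),
    ],
)

# Each result row is (SSN, number of Ids sharing it, lowest Id in the group).
duplicates = conn.execute(
    """
    SELECT SSN, COUNT(*) AS entries, MIN(Id) AS first_id
    FROM person
    GROUP BY SSN
    HAVING COUNT(*) > 1
    """
).fetchall()
print(duplicates)  # [('111-11-1111', 2, 1)]
```

Every Id is still unique as a row identifier, but the query shows one person recorded under two of them, which is the cleanup that has to happen before a uniqueness constraint on SSN can be added.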

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange