If all associated attributes of an intended primary key are identical, is it still a true primary key?
-
01-03-2021 - |
Вопрос
I'll start with an example. If I have a person
table with intended surrogate primary key Id
:
+----+------+------------+-------------+ | Id | name | DoB | SSN | +----+------+------------+-------------+ | 1 | John | 1901-01-01 | 111-11-1111 | | 2 | Jane | 1902-02-02 | 222-22-2222 | | 3 | John | 1901-01-01 | 111-11-1111 | +----+------+------------+-------------+
Note Id
s 1 & 3 have the same attributes; They both represent the same person.
Now from what we know about the theory behind what constitutes a primary key, which I think is well summarized here:
- The primary key must uniquely identify each record.
- A record’s primary-key value can’t be null.
- The primary key-value must exist when the record is created.
- The primary key must remain stable—you can’t change the primary-key field(s).
- The primary key must be compact and contain the fewest possible attributes.
Consider the first bullet, "The primary key must uniquely identify each record." In my example, I suppose whether or not each Id
does represent uniqueness depends on what's really supposed to be considered unique. A different database record? Yes. A different person (what the records are supposed to represent)? No.
So multiple Id
s represent what is functionally the same subject generating the data, present in 2 records. A "two to one Id" of sorts. I've not read anything that directly address the scenario my example illustrates, as is relates to what is or is not a PK.
- Does this example violate the theory behind what constitutes a primary key?
- If not, does this example illustrate a violation of any larger principles of database architecture, or can this concept be reduced to something as simple as "duplication of data - clean it up"?
Many thanks.
Решение
The problem with this model is that you did not identify all candidate keys in the relation and did not enforce uniqueness of those keys you did not identify. In reality {SSN}
and, possibly, {Name, DoB, SomethingElse}
would constitute additional candidate keys.
Nevertheless, Id
is still the primary key of this relation, but it does not identify the entity that you expect(ed) it to. It does not identify a "person", but something else, e.g. "an occurrence of somebody entering person's data".