Question

I am using MS SQL Server, but as a general database design question, I want to know what problems can arise when every row in a database has its own auto-generated surrogate key value.

I know some of the advantages: there is no need to identify unique, non-NULL columns to serve as primary keys, there are no composite primary keys to manage, normal forms are easier to manage, and uniqueness is guaranteed.

Is there any good reason, related to performance, index structures, etc., to use natural (real-world) keys instead of surrogate ones?

Thanks.

Solution

There are many disadvantages to using surrogate keys. The most important ones, IMHO, are:

  1. Using natural keys significantly reduces the number of joins that your queries need to perform. With a surrogate key, a referencing table holds only the generated value, so you have to join back to the referenced table to get meaningful values.
  2. Surrogate keys 'abstract' the actual values users are looking for away from the optimizer, which significantly hinders the usefulness of statistics.
  3. Natural keys enforce logical consistency and prevent duplicates. Surrogate keys are far too often abused as a magic solution, without regard to the candidate keys, which in turn results in logical duplication of data.
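Points 1 and 3 can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server so that it runs anywhere; the tables (`country_s`, `order_s`, `country_n`, `order_n`) and the ISO country code are hypothetical names chosen for the illustration, not taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Surrogate design: orders reference countries by a generated id.
-- No constraint is declared on the candidate key iso_code.
CREATE TABLE country_s (id INTEGER PRIMARY KEY AUTOINCREMENT, iso_code TEXT);
CREATE TABLE order_s (id INTEGER PRIMARY KEY AUTOINCREMENT,
                      country_id INTEGER REFERENCES country_s(id));

-- Natural design: the ISO code is the key and is carried into orders.
CREATE TABLE country_n (iso_code TEXT PRIMARY KEY);
CREATE TABLE order_n (order_no INTEGER PRIMARY KEY,
                      iso_code TEXT REFERENCES country_n(iso_code));
""")

conn.execute("INSERT INTO country_s (iso_code) VALUES ('DE')")
conn.execute("INSERT INTO country_s (iso_code) VALUES ('DE')")  # accepted: logical duplicate
conn.execute("INSERT INTO order_s (country_id) VALUES (1)")
conn.execute("INSERT INTO country_n VALUES ('DE')")
conn.execute("INSERT INTO order_n VALUES (1, 'DE')")

# Point 1: the surrogate design needs a join to show the country code.
surrogate_rows = conn.execute(
    "SELECT o.id, c.iso_code FROM order_s o "
    "JOIN country_s c ON c.id = o.country_id"
).fetchall()

# The natural design reads the code straight from the referencing table.
natural_rows = conn.execute("SELECT order_no, iso_code FROM order_n").fetchall()

# Point 3: the surrogate table now holds two rows for the same country.
duplicates = conn.execute(
    "SELECT COUNT(*) FROM country_s WHERE iso_code = 'DE'"
).fetchone()[0]

# The natural key rejects the same duplicate.
try:
    conn.execute("INSERT INTO country_n VALUES ('DE')")
    natural_duplicate_rejected = False
except sqlite3.IntegrityError:
    natural_duplicate_rejected = True
```

The duplicate in `country_s` could also be prevented by declaring a UNIQUE constraint on `iso_code` alongside the surrogate key; the sketch only shows what happens when the candidate key is ignored.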

While there are (rare) cases that justify the use of surrogate keys, the current state of affairs is that 99.9% of the databases that I've seen just use this 'magic one-size-fits-all' solution, with a detrimental impact on performance, modularity, and data consistency.

To learn some more about it, watch this session that I delivered on the subject at Silicon Valley Code Camp 2017.

OTHER TIPS

If you can use existing column(s) as a natural key, then your table will have one fewer column, so SQL Server will have one fewer column to read.

On the other hand, if the natural key is referenced as a foreign key by another table and it is something long, like a social security number, it has to be duplicated in the referencing table, which could take up more disk space and cause slower joins.
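As a rough back-of-the-envelope sketch of that storage cost: a SQL Server INT takes 4 bytes, while a social security number stored as CHAR(11) (with dashes) takes 11 bytes. The row count below is a made-up figure for illustration, and the estimate ignores row overhead, indexes, and compression.

```python
# Fixed-width storage sizes in SQL Server.
INT_BYTES = 4        # INT surrogate key
SSN_CHAR_BYTES = 11  # CHAR(11), e.g. '123-45-6789'

# Hypothetical number of rows in the referencing table.
referencing_rows = 10_000_000

# Extra bytes spent on the foreign key column when the wide natural key
# is copied into every referencing row instead of a 4-byte integer.
extra_bytes = (SSN_CHAR_BYTES - INT_BYTES) * referencing_rows
extra_mib = extra_bytes / 2**20

print(f"Extra foreign key storage: {extra_mib:.1f} MiB")
```

Any index built on that foreign key column grows by the same per-row difference, which is part of why wider keys can make joins slower.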

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange