Question

Question

Is it a good rule of thumb for database IDs to be "meaningless?" Conversely, are there significant benefits from having IDs structured in a way where they can be recognized at a glance? What are the pros and cons?

Background

I just had a debate with my coworkers about the consistency of the IDs in our database. We have a data-driven application that leverages spring so that we rarely ever have to change code. That means, if there's a problem, a data change is usually the solution.

My argument was that by making IDs consistent and readable, we save ourselves significant time and headaches, long term. Once the IDs are set, they don't have to change often and if done right, future changes won't be difficult. My coworkers position was that IDs should never matter. Encoding information into the ID violates DB design policies and keeping them orderly requires extra work that, "we don't have time for." I can't find anything online to support either position. So I'm turning to all the gurus here at SA!

Example

Imagine this simplified list of database records representing food in a grocery store, the first set represents data that has meaning encoded in the IDs, while the second does not:


ID's with meaning:

Type
1 Fruit
2 Veggie

Product
101 Apple
102 Banana
103 Orange
201 Lettuce
202 Onion
203 Carrot

Location
41 Aisle four top shelf
42 Aisle four bottom shelf
51 Aisle five top shelf
52 Aisle five bottom shelf

ProductLocation
10141 Apple on aisle four top shelf
10241 Banana on aisle four top shelf
//just by reading the ids, it's easy to recongnize that these are both Fruit on Aisle 4

ID's without meaning:

Type
1 Fruit
2 Veggie

Product
1 Apple
2 Banana
3 Orange
4 Lettuce
5 Onion
6 Carrot

Location
1 Aisle four top shelf
2 Aisle four bottom shelf
3 Aisle five top shelf
4 Aisle five bottom shelf

ProductLocation
1 Apple on aisle four top shelf
2 Banana on aisle four top shelf
//given the IDs, it's harder to see that these are both fruit on aisle 4

Summary

What are the pros and cons of keeping IDs readable and consistent? Which approach do you generally prefer and why? Is there an accepted industry best-practice?

-------- edit ( helpful background info from comments, below ): --------

In our tables, the Primary Key is always an ID field containing a unique integer. At first, that integer was arbitrary. Over time, some of these IDs naturally took on meaning among developers/testers. During a recent refactor, certain developers also took time to make all IDs easier to recognize. It made everyone's job 100X easier. Some people (who don't actually use the data/code) vehemently disagreed for theoretical reasons. In practice, not one of those objections are holding true. Moreover, all developers using the data agree that it's now significantly easier to maintain.

I'm looking for (but haven't seen) a defensible argument against using immediately recognizable IDs in a data-centric environment.

No correct solution

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top