First database with referential integrity constraints — suggestions, feedback, errors?

https://stackoverflow.com/questions/4346752

30-09-2019
|

Question

TARGET_RDBMS: MySQL-5.X-InnoDB ("X" equals current stable release)

BACKGROUND: Building my first database with true referential integrity constraints, in an effort to get feedback, after creating the "real" DDL, I've made an abstraction that I believe covers the "feel" of the database; this is only 3 tables of about 20, all with referential integrity constraints; only pattern I see that is missing is a composite key table, which does not have data to be dumped in right now anyway, so I'm just focus on the first iteration.

Sample Data / Unit Test: One thing I do not know is how to build out a sample data set that will offer 100% coverage of the referential integrity modeled -- AND build "Unit Test" around that sample data and this DDL:

Sample DLL:

(Note: Just to be clear, the LEGEND and naming standards are JUST for this example, which I've abstracted from the "real" database. The column names are robotic in nature, and meant to make the meaning and relationship of a given instance as clear as possible. If you have suggestions on the notation system used, please feel free to comment. I'm open to any suggestions. Thanks!)

CREATE DATABASE sampleDB;

use sampleDB;

# ###############
# LEGEND
# - sID = surrogate key
# - nID = natural key
# - cID = common/shared across tables, but NOT unique/natural-key
# - PK = Primary Key
# - FK = Foreign Key
# - data01 = Sample data (non-key,not-shared-across-tables)
# - data02 = Sample data NOT NULL (non-key,not-shared-across-tables)
#
# - uID = user defined unique/natural key (NOTE: not used)

# ###############
# Behavior
# - create_timestamp (NOT NULL, updated on record creation, NOT update)
# - update_timestamp (NOT NULL, updated on record creation AND updates)

CREATE TABLE `TABLE_01` (
  `TABLE_01_sID_PK` MEDIUMINT NOT NULL AUTO_INCREMENT,
  `TABLE_01_cID` int(8) NOT NULL,
  `TABLE_01_data01` varchar(128) default NULL,
  `TABLE_01_data02`  varchar(128) default NULL,
  `create_timestamp` DATETIME DEFAULT NULL,
  `update_timestamp` TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY  (`TABLE_01_sID_PK`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE `TABLE_02` (
  `TABLE_02_sID_PK` MEDIUMINT NOT NULL AUTO_INCREMENT,
  `TABLE_02_nID_FK__TABLE_01_sID_PK` int(8) NOT NULL,
  `TABLE_02_cID` int(8) NOT NULL,
  `TABLE_02_data01` varchar(128) default NULL,
  `TABLE_02_data02` varchar(128) NOT NULL,
  `create_timestamp` DATETIME DEFAULT NULL,
  `update_timestamp` TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`TABLE_02_sID_PK`),
  FOREIGN KEY (TABLE_02_nID_FK__TABLE_01_sID_PK) REFERENCES TABLE_01(TABLE_01_sID_PK),
  INDEX `TABLE_02_nID_FK__TABLE_01_sID_PK` (`TABLE_02_nID_FK__TABLE_01_sID_PK`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE `TABLE_03` (
  `TABLE_03_sID_PK` MEDIUMINT NOT NULL AUTO_INCREMENT,
  `TABLE_03_nID_FK__TABLE_01_sID_PK` int(8) NOT NULL,
  `TABLE_03_nID_FK__TABLE_02_sID_PK` int(8) NOT NULL,
  `TABLE_03_cID` int(8) NOT NULL,
  `TABLE_03_data01` varchar(128) default NULL,
  `TABLE_03_data02` varchar(128) NOT NULL,
  `create_timestamp` DATETIME DEFAULT NULL,
  `update_timestamp` TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`TABLE_03_sID_PK`),
  FOREIGN KEY (TABLE_03_nID_FK__TABLE_01_sID_PK) REFERENCES TABLE_01(TABLE_01_sID_PK),
  FOREIGN KEY (TABLE_03_nID_FK__TABLE_02_sID_PK) REFERENCES TABLE_02(TABLE_02_sID_PK),
  INDEX `TABLE_03_nID_FK__TABLE_01_sID_PK` (`TABLE_03_nID_FK__TABLE_01_sID_PK`),
  INDEX `TABLE_03_nID_FK__TABLE_02_sID_PK` (`TABLE_03_nID_FK__TABLE_02_sID_PK`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

SHOW TABLES;

# DROP DATABASE `sampleDB`;

# #######################
# View table definition
# DESC inserttablename;

# #######################
# View table create statement
# SHOW CREATE TABLE example;

Questions:

Any and all feedback on missing, wrong, or "better" ways to do this database build are welcome. If you have questions, just comment -- and I'll respond ASAP. Again, thanks~!

UPDATE (1):

Just added "MEDIUMINT NOT NULL AUTO_INCREMENT" to the PKs -- not sure how I left that off.

Solution

First of all, I want to applaud you for defining a standard. There is no end to how much it will come to help you in the future.

Having said that, a couple of very subjective opinions from my part:

I don't like to embed type information in names, such as "TABLE_PERSON" or "PERSON_T" because it becomes confusing the second you replace a table with a view instead. At this point you could of course search and replace "PERSON_T" with "PERSON_VW" instead, but it kind of misses the point :) The same goes for columns (although i can't see this in your example). Think of the "n_is_dead" column that gets changed from numeric to varchar.

Can a row exist in a table without being created (create_timestamp)? Declare columns as NOT NULL if they really can't be null. In fact, I start of having NOT NULL on most of my columns because it makes me think harder about the nature of the data.

I'm a fan of naming the primary key column something other than ID. For example

company(company_id, etc)
person(person_id, company_id, firstname etc)

I've heard some people have problems with O/R mappers that want you to have the primary key named "ID" at all times, but I don't know if this is still true of if this has changed recently.

It's not clear to me if you intented to embed (s,n,c) in the column names to indicate whether they are surrogate, natural or common key. But I also don't think this is a good idea. I feel that would "reveal" some implementation detail that doesn't fit naturally in the logical model.

It looks like you are exposing/embedding the foreign key relationship in the column names. I have never thought of this, but I think you will deeply regret this one. If not only because it makes the column names unbearably uggly :)

When choosing a name for an index. The only time I regret naming an index something is when I look at an execution plan and see "index_01" being used. I always wish I had put the column name in the index to make it visible in the xplan. I don't know the limit for an index name, but I always run into the limit on Oracle. So, try to come up with some rule for how to abbreviate the table name. The column name is the important thing here.

Regarding mixed case. I always (no exceptions) go with either ALL_UPPER_CASE or all_lower_case. The reason is that in the past I've been burned when migrating queries between databases when they treat case differently. Lately, I use all_lower_case because the typical font of our editors makes it easier to spot spelling errors in lower case than in upper case. And when I fail at things, it doesn't seem like the editor is SHOUTING AT ME ;)

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow