Question

Question

What is the best practice for ensuring a Key from a "Grand-Parent" or "Great Grand-Parent" table is maintained when a "Child" or "Grand-Child" table is created from multiple relationship trees.

Details Since That Question Doesn't Likely Make Sense

I am attempting to build a database for keeping track of the execution status of automated processes running in our environment.

In general we have a "Job" which triggers one or more "Executables" and those "Executables" can run tasks for one or more customers. We will then have 2 logging tables, one that tracks when a "Job" was started, and another that will log the Success vs Failure status of each "ExecutableCustomer" instance.

A Planned Simplified Schema is below: ER Diagram

When we right the record to the JobExecutableCustomerExecutionLog, I would like to ensure that the Job.ID value associated with JobExecutionLog.ID value matches the Job.ID value associated with JobExecutableCustomer.ID.

Normally I would handle this with a Foreign Key but since Job.ID is not stored on JobExecutableCustomer, JobExecutableCustomerExecutionLog nor JobExecutionLog. The Relationship is indirect.

Example:

I have 2 jobs, "Send Email" and "Send Text Message". "Send Email" initiates a single executable which belongs to 1 Customer. "Send Text Message" has 2 executables (both of which execute for the same customer). I want to make sure that when the record is written to JobExecutableCustomerExecutionLog for "Send Email" the Job.ID associated with JobExecutableCustomerExecutionLog.JobExecutableCustomerID and JobExecutableCustomerExecutionLog.JobExecutionLogID (after walking the relationships up) actually belong to the Job.ID for "Send Email" not "Send Text Message".

As I see it I have 2 options:

  1. Push the value from Job.ID into all the child tables, and make it part of the Foreign Key
  2. Have another process (Trigger or Indexed View) ensure the relationships for me

I personally don't like the idea of pushing the Job.ID value on all the other child tables, so I am leaning towards using a Trigger or something else to handle it. I didn't know if those were my only two options or if I have the ability to configure a "normal" Foreign Key to traverse the relationships all the way up. In some kind of Cascade or something else.

Was it helpful?

Solution

I personally don't like the idea of pushing the Job.ID value on all the other child tables

Why not? The obvious solution is to make JobID the leading PK/clustered index column on all the child tables. And it ensures optimal performance for accessing all these tables by JobID.

Generally whenever you have an "identifying relationship" aka "child table" aka "weak entity" the child table should use a compound primary key whose leading columns are also the foreign key to the parent table. Something like this:

create table Parent
(
  ParentId int not null identity,
  constraint pk_Parent
    primary key(ParentId)
)

create table Child
(
  ParentId int not null, 
  ChildId int not null identity, 
  constraint pk_Child 
    primary key (ParentId,ChildId),
  constraint fk_Child_Parent
    foreign key (ParentId) references Parent(ParentId)
    on delete cascade
)
create table GrandChild
(
  ParentId int not null references Parent, 
  ChildId int not null, 
  GrandChildId int not null identity,
  constraint pk_GrandChild 
    primary key (ParentId,ChildId,GrandChildId),
  constraint fk_GrandChild_Child
    foreign key (ParentId) references Parent(ParentId)
    on delete cascade
)

OTHER TIPS

An example of how data integrity can be maintained using FKs. I have modified the model a bit to allow for an executable to be a part of more than one job. The assumption is that when an executable runs, it runs for all customers associated with that executable. Your ERD does not show constraints, so there is some guessing involved.


-- Customer CUS exists.
--
customer {CUS}
      PK {CUS}
-- Executable EXE exists.
--
executable {EXE}
        PK {EXE}
-- Executable EXE runs task for customer {CST}.
--
customer_exe {EXE, CST}
          PK {EXE, CST}

FK1 {EXE} REFERENCES executable {EXE}
FK2 {CUS} REFERENCES customer   {CUS}
-- Job JOB exists.
--
job {JOB}
 PK {JOB}
-- Step number STP# of job JOB
-- runs executable EXE.
--
job_step {JOB, STP#, EXE}
      PK {JOB, STP#}
      SK {JOB, STP#, EXE}

FK1 {JOB} REFERENCES job {JOB}
FK2 {EXE} REFERENCES executable {EXE}

An assumption is that tasks write to log_ tables, so some redundancy is expected: like EXE in log_job; still FKs prevent anomalies.

-- Executable EXE ran as step number STP#
-- of job JOB on DTE (date-time).
--
log_job {DTE_J, JOB, STP#, EXE}
     PK {DTE_J, JOB, STP#}
     SK {DTE_J, JOB, STP#, EXE}

      FK {JOB, STP#, EXE} REFERENCES
job_step {JOB, STP#, EXE}
-- Executable EXE ran for customer CUS,
-- with success result RES, on DTE_C (date-time),
-- as step number STP# of job JOB, which ran on DTE_J.
--
log_cus {DTE_J, JOB, STP#, CUS, EXE, DTE_C, RES}
     PK {DTE_J, JOB, STP#, CUS}

    FK1 {DTE_J, JOB, STP#, EXE} REFERENCES
log_job {DTE_J, JOB, STP#, EXE}

         FK2 {EXE, CST} REFERENCES
customer_exe {EXE, CST}

Note:

All attributes (columns) NOT NULL

PK = Primary Key
AK = Alternate Key   (Unique)
SK = Proper Superkey (Unique)
FK = Foreign Key
Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange
scroll top