Factoring out nulls in bill-of-materials style relations

https://stackoverflow.com/questions/368725

21-08-2019

Question

Given the schema

PERSON { name, spouse }

where PERSON.spouse is a foreign key to PERSON.name, NULLs will be necessary when a person is unmarried or we don't have any info.

Going with the argument against nulls, how do you avoid them in this case?

I have an alternate schema

PERSON { name }
SPOUSE { name1, name2 }

where SPOUSE.name* are FKs to PERSON. The problem I see here is that there is no way to ensure someone has only one spouse (even with all possible UNIQUE constraints, it would be possible to have two spouses).

What's the best way to factor out nulls in bill-of-materials style relations?

Solution

All right, use auto-incrementing IDs and then use a check constraint. The "Name1" column (which would only be an int ID) will be forced to hold only ODD-numbered IDs, and Name2 only EVEN-numbered ones.

Then create a unique constraint on each of the two columns (Name1 and Name2).
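A minimal sketch of this idea, written in Python with the standard-library sqlite3 module so that it runs as-is. SQLite, the table and column names (person, spouse, id1, id2), the sample names, and the helpers make_db and try_marry are all assumptions made for illustration; the answer does not name a DBMS or give DDL.

```python
import sqlite3

def make_db():
    """Create an in-memory database with the odd/even spouse table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript("""
        CREATE TABLE person (
            id   INTEGER PRIMARY KEY AUTOINCREMENT,
            name TEXT NOT NULL
        );
        CREATE TABLE spouse (
            -- id1 may only hold odd person ids, id2 only even ones
            id1 INTEGER NOT NULL UNIQUE REFERENCES person (id) CHECK (id1 % 2 = 1),
            id2 INTEGER NOT NULL UNIQUE REFERENCES person (id) CHECK (id2 % 2 = 0)
        );
    """)
    # Hypothetical sample people; they receive ids 1 to 4 in order.
    conn.executemany("INSERT INTO person (name) VALUES (?)",
                     [("Ann",), ("Bob",), ("Cat",), ("Dan",)])
    conn.commit()
    return conn

def try_marry(conn, id1, id2):
    """Insert a spouse row; return False if any constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO spouse (id1, id2) VALUES (?, ?)", (id1, id2))
        return True
    except sqlite3.IntegrityError:
        return False

conn = make_db()
print(try_marry(conn, 1, 2))  # odd/even pair, neither is married yet
print(try_marry(conn, 3, 2))  # rejected: 2 already appears in id2
print(try_marry(conn, 2, 1))  # rejected: wrong parity in both columns
```

Because each ID can only ever appear in one of the two columns, a unique constraint per column is enough to limit everyone to one spouse. The price, in this sketch at least, is that two people whose IDs have the same parity can never be paired.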

OTHER TIPS

I think that enforcing no NULLs and no duplicates for this type of relationship makes the schema definition way more complicated than it really needs to be. Even if you allow nulls, it would still be possible for a person to have more than one spouse, or to have conflicting records, e.g.:

PERSON { A, B }
PERSON { B, C }
PERSON { C, NULL }

You'd need to introduce more data, like gender (or "spouse-numbers" for same-sex marriages?) to ensure that, for example, only Persons of one type are allowed to have a Spouse. The other Person's spouse would be determined by the first person's record. E.g.:

PERSON { A, FEMALE, B }
PERSON { B, MALE, NULL }
PERSON { C, FEMALE, NULL }

... So that only PERSONs who are FEMALE can have a non-null SPOUSE.

But IMHO, that's overcomplicated and non-intuitive even with NULLs. Without NULLs, it's even worse. I would avoid making schema restrictions like this unless you literally have no choice.

Well, first I would use auto-incrementing IDs as, of course, someone could have the same name. But, I assume you intend to do that and won't harp on it. However, how does the argument against NULLs go exactly? I don't have any problem with NULLs and think that is the appropriate solution to this problem.

I'm not sure why no one has pointed this out yet, but it's actually quite easy to ensure that a person has only one spouse, using pretty much the same model that you have in your question.

I'm going to ignore for the moment the use of a name as a primary key (it can change and duplicates are fairly common, so it's a poor choice) and I'm also going to leave out the possible need for historical tracking (you might want to add an effective date of some sort so that you know WHEN they were a spouse - Joe Celko has written some good stuff on temporal modeling, but I don't recall which book it was in at the moment). Otherwise if I got divorced and remarried you would lose that I had another spouse at another time - maybe that isn't important to you though.

Also, you might want to break up name into first_name, middle_name, last_name, prefix, suffix, etc.

Given those caveats...

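-- One row per person; the name is the key only for the sake of the example.
CREATE TABLE People
(
     person_name     VARCHAR(100) NOT NULL,
     CONSTRAINT PK_People PRIMARY KEY (person_name)
)
GO
-- One row per married person. The primary key on person_name allows
-- each person at most one spouse row, and no column is nullable.
CREATE TABLE Spouses
(
     person_name     VARCHAR(100) NOT NULL,
     spouse_name     VARCHAR(100) NOT NULL,
     CONSTRAINT PK_Spouses PRIMARY KEY (person_name),
     CONSTRAINT FK_Spouses_People FOREIGN KEY (person_name) REFERENCES People (person_name)
)
GO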

If you wanted to have spouses appear in the People table as well then you could add an FK for that as well. However, at that point you're dealing with a bidirectional link, which becomes a bit more complex.
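To see how the primary key behaves, here is a rough sketch of the same two tables in SQLite, driven from Python's standard-library sqlite3 module. The switch to SQLite, the extra foreign key on spouse_name (the optional one this answer mentions), the sample names, and the helpers make_people_db and add_spouse are assumptions for illustration, not part of the original answer.

```python
import sqlite3

def make_people_db():
    """Build the People/Spouses tables in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript("""
        CREATE TABLE People (
            person_name TEXT NOT NULL PRIMARY KEY
        );
        CREATE TABLE Spouses (
            person_name TEXT NOT NULL PRIMARY KEY
                REFERENCES People (person_name),
            -- Assumption: the optional FK on spouse_name that the answer mentions
            spouse_name TEXT NOT NULL
                REFERENCES People (person_name)
        );
        INSERT INTO People VALUES ('Fred'), ('Wilma'), ('Betty');
    """)
    return conn

def add_spouse(conn, person, spouse):
    """Insert one Spouses row; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO Spouses VALUES (?, ?)", (person, spouse))
        return True
    except sqlite3.IntegrityError:
        return False

conn = make_people_db()
print(add_spouse(conn, "Fred", "Wilma"))  # first row for Fred is accepted
print(add_spouse(conn, "Fred", "Betty"))  # second row for Fred violates the PK
print(add_spouse(conn, "Wilma", "Fred"))  # the reverse link is a separate row
```

Nothing in this sketch forces the Wilma row to exist or to point back at Fred; that is the bidirectional complexity the answer refers to.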

Well, begin by using a key other than name, perhaps an int seed. But to prevent someone from having more than one spouse, simply add a unique index on the parent column (name1) in the spouse table. That will prevent you from ever inserting the same name1 twice.
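A small sketch of the unique-index suggestion, again using SQLite through Python's sqlite3 module. The integer keys, the index name ux_spouse_name1, and the helpers make_index_db and insert_pair are hypothetical and chosen only for illustration.

```python
import sqlite3

def make_index_db():
    """Spouse table keyed by integer ids, with a unique index on name1."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE spouse (
            name1 INTEGER NOT NULL,  -- hypothetical integer person key
            name2 INTEGER NOT NULL
        );
        CREATE UNIQUE INDEX ux_spouse_name1 ON spouse (name1);
    """)
    return conn

def insert_pair(conn, name1, name2):
    """Insert one pair; return False if the unique index rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO spouse (name1, name2) VALUES (?, ?)",
                         (name1, name2))
        return True
    except sqlite3.IntegrityError:
        return False

conn = make_index_db()
print(insert_pair(conn, 1, 2))  # accepted
print(insert_pair(conn, 1, 3))  # rejected: name1 = 1 is already present
print(insert_pair(conn, 3, 2))  # accepted, although 2 is now listed twice in name2
```

As the last call suggests, an index on name1 alone only guards that one column. Keeping a person from appearing twice in name2, or once in each column, would need further constraints.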

You need a Person table and a separate "Partner_Of" table to define the relationship.

Person (id, name, etc );

Partner_Of (id, partner_id, relationship);

To deal with more complex social situations you would probably need some dates in there. Also, to simplify the SQL, you need one entry for (fred,wilma,husband) and a matching entry for (wilma,fred,wife).
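One way the paired-row layout could be made self-enforcing is sketched below in SQLite via Python's sqlite3 module. The table name partner_of, the primary key on id, the CHECK, and the deferred self-referencing foreign key that demands the mirror row are assumptions added here, not part of the answer; make_partner_db and add_rows are hypothetical helpers.

```python
import sqlite3

def make_partner_db():
    """Person plus a partner table that requires mirrored rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript("""
        CREATE TABLE person (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE partner_of (
            id           INTEGER NOT NULL PRIMARY KEY REFERENCES person (id),
            partner_id   INTEGER NOT NULL REFERENCES person (id),
            relationship TEXT NOT NULL,
            CHECK (id <> partner_id),
            UNIQUE (id, partner_id),
            -- Assumption: require the mirror row (partner_id, id) to exist,
            -- checked at COMMIT so both rows can go in one transaction.
            FOREIGN KEY (partner_id, id)
                REFERENCES partner_of (id, partner_id)
                DEFERRABLE INITIALLY DEFERRED
        );
        INSERT INTO person (id, name) VALUES (1, 'fred'), (2, 'wilma');
    """)
    return conn

def add_rows(conn, rows):
    """Insert rows in one transaction; return False if the commit is rejected."""
    try:
        conn.executemany(
            "INSERT INTO partner_of (id, partner_id, relationship) VALUES (?, ?, ?)",
            rows)
        conn.commit()  # the deferred foreign key is evaluated here
        return True
    except sqlite3.IntegrityError:
        conn.rollback()
        return False

conn = make_partner_db()
print(add_rows(conn, [(1, 2, "husband")]))                  # lone row: no mirror entry
print(add_rows(conn, [(1, 2, "husband"), (2, 1, "wife")]))  # matching pair
```

With the primary key on id, each person gets at most one row, and the deferred key means a row can only be committed together with its counterpart. Adding dates for historical tracking would change the key and is left out of this sketch.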

You can use a trigger to enforce the constraint. PostgreSQL has constraint triggers, a particularly nice way to defer the constraint evaluation until the appropriate time in the transaction.
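As a rough illustration of the trigger approach, the sketch below uses an ordinary BEFORE INSERT trigger in SQLite (through Python's sqlite3 module). This is not a PostgreSQL constraint trigger and is not deferred; it only shows the kind of check such a trigger might perform. The table layout, the trigger name, and the helpers make_trigger_db and marry are assumptions.

```python
import sqlite3

def make_trigger_db():
    """Spouse table guarded by a trigger instead of declarative constraints."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE spouse (
            name1 TEXT NOT NULL,
            name2 TEXT NOT NULL
        );
        -- Reject a new pair if either person already appears in any row,
        -- in either column.
        CREATE TRIGGER spouse_single_marriage
        BEFORE INSERT ON spouse
        FOR EACH ROW
        WHEN EXISTS (
            SELECT 1 FROM spouse
            WHERE name1 IN (NEW.name1, NEW.name2)
               OR name2 IN (NEW.name1, NEW.name2)
        )
        BEGIN
            SELECT RAISE(ABORT, 'person already has a spouse');
        END;
    """)
    return conn

def marry(conn, name1, name2):
    """Insert a pair; return False if the trigger aborts the insert."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO spouse (name1, name2) VALUES (?, ?)",
                         (name1, name2))
        return True
    except sqlite3.IntegrityError:
        return False

conn = make_trigger_db()
print(marry(conn, "A", "B"))  # accepted
print(marry(conn, "B", "C"))  # rejected: B is already in a pair
print(marry(conn, "C", "D"))  # accepted
```

This sketch only covers INSERT; a real implementation would also need to guard UPDATE, and in PostgreSQL the check could be deferred to the end of the transaction as the answer describes.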

From Fabian Pascal's Practical Issues in Database Management, pp. 66-67:

Stored procedures—whether triggered or not—are preferable to application level integrity code, but they are practically inferior to and riskier than declarative support because they are more burdensome to write, error prone, and cannot benefit from full DBMS optimization.

...

Choose DBMSs with better declarative integrity support. Given the considerable gaps in such support by products, knowledgeable users would be at least in a position to emulate correctly—albeit with procedural and/or application code—constraints not supported by the DBMS.

Licensed under: CC-BY-SA with attribution
