Question

Apologies for asking this like an interview question...

Say you have 3 tables A, B and C

A is huge

B references A’s primary key through a foreign key and is small

C wants to reference the data in B but doesn’t know how

Which of the following options seems better:

  • Add a column to C with a foreign key to B’s primary key (which is itself a foreign key to A’s primary key)
  • Add a column to C with a foreign key to A’s primary key directly
  • Something else...

The plan is for C to join with B using our new column, but which table should C reference?

My gut feeling is that on an insertion into C the constraint check will be faster if it's referencing a smaller table, but I'm not sure if that's how it actually plays out

Clarification: C is a subset of B and B is a subset of A and all proposed relations are 1:1


Solution

If A is in a one-to-zero-or-one relation with B, then B inherits its primary key from A: B's primary key is also a foreign key to A's primary key.

If B is in a one-to-zero-or-one relation with C, then C inherits its primary key from B, which in turn is the primary key of A.

Nothing out of the ordinary here; this is perfectly acceptable.
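A minimal sketch of that key chain, using Python's built-in sqlite3 module as a stand-in (the question does not say which database is in use, and the lowercase table names and the single `id` column are assumptions made for illustration). It only shows that C can be inserted into when the key exists in B, and is rejected when the key exists in A alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys once this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE a (id INTEGER PRIMARY KEY);
-- B's primary key is also a foreign key to A (A 1 : 0..1 B).
CREATE TABLE b (id INTEGER PRIMARY KEY REFERENCES a(id));
-- C's primary key is also a foreign key to B (B 1 : 0..1 C).
CREATE TABLE c (id INTEGER PRIMARY KEY REFERENCES b(id));
""")

conn.executemany("INSERT INTO a (id) VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO b (id) VALUES (?)", [(1,), (2,)])


def try_insert_c(connection: sqlite3.Connection, key: int) -> bool:
    """Return True if the row was accepted, False if the foreign key rejected it."""
    try:
        connection.execute("INSERT INTO c (id) VALUES (?)", (key,))
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert_c(conn, 1)  # key 1 exists in B
rejected = try_insert_c(conn, 3)  # key 3 exists in A but not in B

# C joins to B (and to A, if needed) on the same key value.
joined = conn.execute(
    "SELECT c.id FROM c JOIN b ON b.id = c.id JOIN a ON a.id = b.id"
).fetchall()
```

Because all three tables share one key value, the join from C to B needs no extra column.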

Other tips

It depends a lot on other things. Is the data in C always going to be a subset of B, or is it possible that it holds other information? The latter could indicate a poorly designed database.

From the way you describe it, I think the database may not be sufficiently normalized (unless you add another column in B to hold the reference to C, rather than only using the foreign key directly).

From the way you describe it, I can only imagine that the data in C will be used less intensively (for testing or similar), so I think your idea of making C reference B is the way to go. I hope my answer is clear enough.

There is no proper answer to a question like this.

The first solution requires B's primary key to be a foreign key to A, which implies a 1:1 relation. Since you didn't mention such a restriction, this obviously won't work if A:B is a 1:many relation.

The second solution is wrong because it doesn't represent the relation you are trying to establish: you want to connect C with B, not C with A.

As for other solutions, if speed is what you are after, there are plenty: triggers (depending on insert/update volume), an index on some partition (note that there is no such thing as a partial foreign key in Postgres), redesigning the tables and cleaning up the relations, using cstore for large data sets, and so on.

For example, you might want to consider an intermediate table I between A and B (if this is a 1:many relation), so that A:I is 1:1 and I:B is 1:many, and then connect C with I. This won't guarantee that every entry in C has an entry in B (which is hard anyway when B is the "many" side of a relation), but it would make a query that checks for this as fast as a scan of C (not A).
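A rough sketch of the intermediate-table idea, again using Python's sqlite3 as a stand-in. The column names (`b_id`, `i_id`), the index name and the sample rows are all hypothetical, chosen only to make the example run. It shows C referencing I, plus the separate query that finds rows of C with no matching row in B.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enable FK enforcement in SQLite

conn.executescript("""
CREATE TABLE a (id INTEGER PRIMARY KEY);
-- A:I is 1:1, so I reuses A's key.
CREATE TABLE i (id INTEGER PRIMARY KEY REFERENCES a(id));
-- I:B is 1:many, so B has its own key plus a reference to I.
CREATE TABLE b (
    b_id INTEGER PRIMARY KEY,
    i_id INTEGER NOT NULL REFERENCES i(id)
);
-- Hypothetical index so the existence check below can probe B by i_id.
CREATE INDEX b_i_id_idx ON b (i_id);
-- C connects to I, not to B.
CREATE TABLE c (id INTEGER PRIMARY KEY REFERENCES i(id));
""")

conn.executemany("INSERT INTO a (id) VALUES (?)", [(1,), (2,), (3,), (4,)])
conn.executemany("INSERT INTO i (id) VALUES (?)", [(1,), (2,), (3,)])
conn.executemany(
    "INSERT INTO b (b_id, i_id) VALUES (?, ?)", [(10, 1), (11, 1), (12, 2)]
)
conn.executemany("INSERT INTO c (id) VALUES (?)", [(1,), (3,)])

# The foreign key cannot guarantee that each C row has a B row, so check it
# with a query driven by C: one indexed probe into B per C row.
c_without_b = conn.execute("""
    SELECT c.id
    FROM c
    WHERE NOT EXISTS (SELECT 1 FROM b WHERE b.i_id = c.id)
    ORDER BY c.id
""").fetchall()
```

Here C row 3 is accepted by the foreign key because I contains 3, yet it has no B row, which is exactly the gap the check query reports.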

Also note that enforcing a foreign key is not a generic "constraint check" but a specific index lookup, so in the C:B scenario only B's primary key is consulted; the fact that it is also a foreign key to A's primary key is irrelevant.
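To illustrate the index-lookup point, the sketch below asks SQLite (used here only as a convenient stand-in, not Postgres) how it would execute the kind of probe a foreign key check on C would conceptually run against B: a lookup of one key value in B's primary key. The probe query itself is an assumption written for illustration; it is not the statement any particular engine issues internally.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INTEGER PRIMARY KEY);
CREATE TABLE b (id INTEGER PRIMARY KEY REFERENCES a(id));
""")

# Ask the planner how it would find a single key in B.
plan_rows = conn.execute(
    "EXPLAIN QUERY PLAN SELECT 1 FROM b WHERE id = ?", (42,)
).fetchall()

# The last column of each plan row is the human-readable step description.
details = [row[-1] for row in plan_rows]
for detail in details:
    print(detail)
```

The plan reports a search via B's primary key rather than a scan of the table, and A is not touched at all.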

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange