Alternatives to storing a record with exactly n multiple foreign keys from the same foreign table, where the relationships can't be repeated

https://dba.stackexchange.com/questions/261901

26-02-2021
|

Pregunta

Say there are entities called singulars, and entities called relationships.

It takes exactly two singulars to make up a relationships entity.

That pair of singulars can't be repeated elsewhere in relationships, in any order.

One way to model it could be this way:

+----------------+
|relationships   |         +----------+
+----------------+         |singulars |
|id              |         +----------+
|singular_id_1   <---------+id        |
|singular_id_2   <---+     |attribute1|
|pair_description|         |attribute2|
|pair_date       |         |          |
|                |         +----------+
+----------------+

With this pattern, it becomes necessary to check both foreign key fields in relationships for the existence of singulars, which could be on either side. The order doesn't matter, yet it is defined in the schema... So the queries end up with a number of AND/OR groups and cases.

Expanding on that approach could be to store two records for every pair, with the singular_id_[n] swapped on both sides. While that solves some querying complexities, it would introduce additional complexities to make it infeasible.

Using an intermediate table seems like one potential solution:

+----------------+       +-----------------------+
|relationships   |       |singulars_relationships|        +----------+
+----------------+       +-----------------------+        |singulars |
|id              <-------+relationship_id        |        +----------+
|pair_description|       |singular_id            +-------->id        |
|pair_date       |       |                       |        |attribute1|
|                |       +-----------------------+        |attribute2|
+----------------+                                        |          |
                                                          +----------+

So the records might end up something like this:

+----------------------------------+
|relationships                     |
+----------------------------------+
|id   pair_description   pair_date |
+----------------------------------+
|1    Fizz buzz blitz    2022-02-20|
|2    Blitz buzz fizz    2022-02-22|
+----------------------------------+

+----------------------------------+
|singulars_relationships           |
+----------------------------------+
|relationship_id   singular_id     |
+----------------------------------+
|1                 1               |
|1                 2               |
|2                 3               |
|2                 4               |
+----------------------------------+

+-----------------------------+
|singulars                    |
+-----------------------------+
|id   attribute1   attribute2 |
+-----------------------------+
|1    Fizz         Blitz      |
|2    Buzz         Foo        |
|3    Bar          World      |
|4    Blorg        Hello      |
+-----------------------------+

There, singulars_relationships is where the pairs are defined. If a singular_id exists in there, it is already in a pair. One problem that may arise with this pattern could be that three or more singular_id s could end up associated with a relationship_id, and the "exactly n" constraint would then be compromised.

Are there official terms for this type of scenario? And other theory and alternatives?

Solución

Are there official terms for this type of scenario?

Yes. This is a Symmetric Relation. And "relation" here has the same meaning as in "Relational Database". An RDBMS is a database management system designed around storing relations. However RDBMSs don't have a native way to store symmetric relations. You either have to store both tuples, eg (a,b) and (b,a) as separate rows, or you have to use some sort of convention to store only one tuple. A common approach is to use a check constraint on the FKs.

check (singular_id_1 < singular_id_2)

Assuming the relation is anti-reflexive.

Otros consejos

To long for a comment, so I'll just add this as a complement to David's solution. Even though the question was of theoretical nature, it may be of interest to see how that can be implemented in real life.

In some situations, the anti-reflexive property, and the symmetric property is too strong, but we still want the "uniqueness" property to hold. One common pattern is to use BEFORE TRIGGERS to make the relation symmetric without letting the users having to be aware of the symmetric property:

CREATE TRIGGER trigger1
BEFORE INSERT ON t
REFERENCING NEW AS N
FOR EACH ROW
    SET (N.a, N.b) = (LEAST(N.a, N.b), GREATEST(N.a, N.b));

Properties in table:

CHECK(a<=b);
UNIQUE(a,b);

A variation of this theme is to use generated columns

CREATE TABLE t 
( a int not null
, b int not null
, least_a_b generated always as ( least(a,b) )
, greatest_a_b generated always as ( greatest(a,b))
, unique (least_a_b, greatest_a_b)
);

The generated columns may also be hidden:

least_a_b generated always as ( least(a,b) ) IMPLICITLY HIDDEN

Examples are from Db2, but similar functionality should exist for other vendors.

Licenciado bajo: CC-BY-SA con atribución

No afiliado a dba.stackexchange