Is there an elegant way to store a dual relationship (i.e. user 1 and user 2 are friends)

https://stackoverflow.com/questions/8808994

26-10-2019
|

Pergunta

I've run into the same problem in two different pieces of work this month:

Version 1: User 1 & User 2 are friends
Version 2: Axis 1 & Axis 2 when graphed should have the quadrants colored...

The problem is, I don't see an elegant way, using a RDBMS, to store and query this information.

There are two obvious approaches:

Approach 1:

store the information twice (i.e. two db rows rows per relationship):
u1, u2, true 
u2, u1, true
u..n, u..i, true
u..i, u..n, true

have rules to always look for the inverse on updates: 
on read, no management needed
on create, create inverse
on delete, delete inverse
on update, update inverse

Advantage:    management logic is always the same.
Disadvantage: possibility of race conditions, extra storage (which is admittedly cheap, but feels wrong)

Approach 2:

store the information once (i.e. one db row per relationship)
u1, u2, true
u..n, u..i, true

have rules to check for corollaries:
on read, if u1, u2 fails, check for u2, u1 
on create u1, u2: check for u2, u1, if it doesn't exist, create u1, u2
on delete, no management needed
on update, optionally redo same check as create

Advantage: Only store once
Disadvantage: Management requires different set of cleanup depending on the operation

I'm wondering if there's a 3rd approach that goes along the lines of "key using f(x,y) where f(x,y) is unique for every x,y combination and where f(x,y) === f(y,x)"

My gut tells me that there should be some combination of bitwise operations that can fulfill these requirements. Something like a two-column:

key1 = x && y key2 = x + y

I'm hoping that people who spent more time in the math department, and less time in the sociology department have seen a proof of the possibility or impossibility of this and can provide a quick "[You moron,] its easily proven (im)possible, see this link" (name calling optional)

Any other elegant approach would also be very welcome.

Thanks

Solução

There is also a way to use the 2nd approach by adding an extra constraint. Check that u1 < u2:

CREATE TABLE User
( Name VARCHAR(10) NOT NULL
, PRIMARY KEY (Name)
) ;

CREATE TABLE MutualFriendship
( u1 VARCHAR(10) NOT NULL
, u2 VARCHAR(10) NOT NULL
, PRIMARY KEY (u1, u2)
, FOREIGN KEY (u1) 
    REFERENCES User(Name)
, FOREIGN KEY (u2) 
    REFERENCES User(Name)
, CHECK (u1 < u2) 
) ;

The rules to read, create, insert or update will have to use the (LEAST(u1,u2), GREATEST(u1,u2)).

Outras dicas

In SQL it's easy to implement the constraints to support your first approach:

CREATE TABLE MutualFriendship
(u1 VARCHAR(10) NOT NULL,
 u2 VARCHAR(10) NOT NULL,
 PRIMARY KEY (u1,u2),
 FOREIGN KEY (u2,u1) REFERENCES MutualFriendship (u1,u2));

INSERT INTO MutualFriendship VALUES
('Alice','Bob'),
('Bob','Alice');

For anyone that's interested, I played around with a few bitwise operations and found that the following seems to fulfill the criteria for f(x,y):

#Python, returns 3 tuple
def get_hash(x, y):
  return (x & y, x | y, x * y)

I can't prove it, though.

"x is a friend of y".

Define a table of (x,y) pairs and enforce a canonical form, e.g. x<y. This will ensure that you cannot have both (p,q) and (q,p) in your database, thus it will ensure "store once".

Create a view as SELECT x,y FROM FRIENDS UNION SELECT x as y, y as x FROM FRIENDS.

Do your updates against the base table (downside : updaters must be aware of the enforced canonical form), do your queries against the view.

You seem to limit the number of friends to 1. If this is the case then I would use something like u1,u2 u2,u1 u3,null u4,u5 u5,u4

u3 does not have a friend.

Licenciado em: CC-BY-SA com atribuição

Não afiliado a StackOverflow