I'm looking to determine whether it is better from a performance and coding perspective to store two associated database records as a single row (and search both columns for a specific record since the value could be in either place) or create a second row for that association and only search one column.
Hopefully an example will help:
UserTable
userID INTEGER,
firstName VARCHAR2(20),
lastName VARCHAR2(20)
2 rows:
1, John, Smith
2, Terry, Jenkins
Second table (to track relationship between the two)
RelationshipTable
relationshipID INTEGER,
userID1 INTEGER,
userID2 INTEGER
Now, to store a relationship between John and Terry I could do:
Option1 (1 row):
relationshipID, userID1, userID2
1, 1, 2
Then, to look for any relationship that Terry is a part of, I would have to do something like:
SELECT *
FROM RelationshipTable
WHERE userID1 = [terrysID] OR userID2 = [terrysID]
Or I could go with 2 rows, one per direction, so that each ID in the association appears in the userID1 column of one row.
Option2 (2 rows):
relationshipID, userID1, userID2
1, 1, 2
2, 2, 1
and find any relationships that Terry is a part of by:
SELECT *
FROM RelationshipTable
WHERE userID1 = [terrysID]
I'm not sure which is better.
I could set up indexes on both columns, which would help with the first option. However, I would still have to post-process the results to determine which column in the result set holds the ID that is not Terry's. And I think the coding is a bit messier, since I'd have to repeat that logic in multiple places.
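To make the post-processing concern concrete: I suspect it could be pushed into the query itself with a CASE expression, something like the sketch below. It uses Python's built-in sqlite3 module only so that it runs as-is (my real database is not SQLite), and the names other_users and otherUserID are made up for illustration.

```python
import sqlite3

def other_users(conn, user_id):
    """Return the IDs related to user_id, whichever column they are stored in."""
    rows = conn.execute(
        """
        SELECT CASE WHEN userID1 = :uid THEN userID2 ELSE userID1 END AS otherUserID
        FROM RelationshipTable
        WHERE userID1 = :uid OR userID2 = :uid
        ORDER BY otherUserID
        """,
        {"uid": user_id},
    ).fetchall()
    return [row[0] for row in rows]

# Option 1 data: a single row for the John (1) / Terry (2) relationship.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE RelationshipTable ("
    "relationshipID INTEGER PRIMARY KEY, userID1 INTEGER, userID2 INTEGER)"
)
conn.execute("INSERT INTO RelationshipTable VALUES (1, 1, 2)")

print(other_users(conn, 2))  # Terry's relationships
print(other_users(conn, 1))  # John's relationships
```

This keeps a single row per relationship, but every query still carries both the OR and the CASE.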
On the other hand, the second approach effectively doubles the amount of data and, even scarier, duplicates data without adding any real "business value". So if that relationship ever ended, I would have to make sure I deleted both records (or soft-deleted them, or whatever we chose to do).
I never know in advance whether I will be searching for John's relationships or Terry's relationships, so I cannot intelligently insert either ID into a specific column at the time the relationship is created.
Thoughts? There might be a third option that I haven't thought of that is better. Something like creating a view on the table that produces the two rows for querying without actually duplicating the data? Obviously that would create additional overhead on the system.
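Roughly what I have in mind for the view idea is sketched below. Again, this is wrapped in Python's sqlite3 module only to make it runnable, and RelationshipView, userID, relatedUserID, and related_to are hypothetical names, not anything that exists in my schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE RelationshipTable (
    relationshipID INTEGER PRIMARY KEY,
    userID1 INTEGER,
    userID2 INTEGER
);
-- Only one row is stored per relationship (Option 1).
INSERT INTO RelationshipTable VALUES (1, 1, 2);

-- Hypothetical view: each stored row appears once per direction.
CREATE VIEW RelationshipView AS
    SELECT relationshipID, userID1 AS userID, userID2 AS relatedUserID
    FROM RelationshipTable
    UNION ALL
    SELECT relationshipID, userID2 AS userID, userID1 AS relatedUserID
    FROM RelationshipTable;
""")

def related_to(user_id):
    """Look up relationships by searching a single column of the view."""
    rows = conn.execute(
        "SELECT relatedUserID FROM RelationshipView "
        "WHERE userID = ? ORDER BY relatedUserID",
        (user_id,),
    ).fetchall()
    return [row[0] for row in rows]

print(related_to(2))  # Terry's relationships
print(related_to(1))  # John's relationships
```

Queries then look like Option 2 (one column to search), while deletes only have to touch one stored row. Whether the optimizer can still use indexes on both underlying columns through the view is part of what I'm unsure about.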
Edit:
This looks like a similar question, but I am not sure any answer accurately satisfies what I am looking for.
Two way relationships in SQL queries
Thanks!