Same Join Constraints on Multiple Columns

https://dba.stackexchange.com/questions/281022

11-03-2021
|

Question

We have a situation that I have been able to recreate with the following simple example. I have the following two sample tables:

CREATE TABLE contact_info
(
  id INT UNSIGNED AUTO_INCREMENT,
  priContactId INT,
  secContactId INT,
  blahBlah VARCHAR(32),

  PRIMARY KEY(id)  
);

and

CREATE TABLE name_lookup
(
  id INT UNSIGNED AUTO_INCREMENT,
  contactID INT,
  contactName VARCHAR(32),

  PRIMARY KEY(id)  
);

I populate them as follows:

INSERT INTO contact_info(priContactId, secContactId, blahBlah) VALUES(1, 3, "Team A"), (4, 2, "Team B");
INSERT INTO name_lookup(contactID, contactName) VALUES(1, "John Doe"), (2, "Mary Smith"), (3, "Jose Garcia"), (4, "Larry Brown");

Obviously, the contents of the tables are as follows:

+----+--------------+--------------+----------+
| id | priContactId | secContactId | blahBlah |
+----+--------------+--------------+----------+
|  1 |            1 |            3 | Team A   |
|  2 |            4 |            2 | Team B   |
+----+--------------+--------------+----------+

+----+-----------+-------------+
| id | contactID | contactName |
+----+-----------+-------------+
|  1 |         1 | John Doe    |
|  2 |         2 | Mary Smith  |
|  3 |         3 | Jose Garcia |
|  4 |         4 | Larry Brown |
+----+-----------+-------------+

We would like to perform a JOIN operation so that we get output like this:

+-------------+-------------+--------+
| John Doe    | Jose Garcia | Team A |
+-------------+-------------+--------+
| Larry Brown | Mary Smith  | Team B |
+-------------+-------------+--------+

The join constraints for both the priContactId and secContactId columns are the same and I am having tough time figuring out what the JOIN query should look like.

FYI, we are using MySQL version 5.6.49.

Solution

This is an interesting example of where an (indirect) SELF-JOINs prove useful!

In order to answer your problem, I did the following (all the SQL below is available on the fiddle here):

I used your DDL and DML as supplied in the question - thanks for that (and a +1 - you've only asked two questions, so I'd consider you a New Contributor and it's great to see that some people take the trouble to supply DDL and DML - if only all OP's did the same!).

CREATE TABLE name_lookup
(
  id INT UNSIGNED AUTO_INCREMENT,
  contact_id INT,
  contact_name VARCHAR(32),

  PRIMARY KEY(id)  
);



CREATE TABLE contact_info
(
  id INT UNSIGNED AUTO_INCREMENT,
  pri_contact_id INT,
  sec_contact_id INT,
  blah_blah VARCHAR(32),

  PRIMARY KEY(id)  
);

Populate them:

INSERT INTO name_lookup(contact_id, contact_name) 
VALUES
(1, "John Doe"), (2, "Mary Smith"), 
(3, "Jose Garcia"), (4, "Larry Brown");

and

INSERT INTO contact_info(pri_contact_id, sec_contact_id, blah_blah) 
VALUES(1, 3, "Team A"), (4, 2, "Team B"), (1, NULL, "Team A");

Notice the final record which has sec_contact_id = NULL - see the discussion in the EDIT below. I'm implicitly assuming that the team identity is determined by the pri_contact_id - define as necessary.

You'll also notice that I use snake_case or lower_case_with_underscores - personal preference - pick a style and stick to it!.

I've left extra fields in the SQL in the fiddle so that you can see the thought processes involved and how I arrived at a solution!

SELECT 
  nl1.id, nl1.contact_id, nl1.contact_name, 
  ci1.pri_contact_id, ci1.sec_contact_id, ci1.blah_blah
FROM name_lookup nl1
JOIN contact_info ci1 
  ON nl1.contact_id = ci1.pri_contact_id;

Result:

id  contact_id  contact_name    pri_contact_id  sec_contact_id  blah_blah
 1           1      John Doe                 1               3     Team A
 4           4   Larry Brown                 4               2     Team B

So, now we join name_lookup back on itself using contact_info as the link between them.

SELECT 
  nl1.id, nl1.contact_id, nl1.contact_name,
  nl2.id, nl2.contact_id, nl2.contact_name,
  ci1.pri_contact_id, ci1.sec_contact_id, ci1.blah_blah
FROM name_lookup nl1
JOIN contact_info ci1 
  ON nl1.contact_id = ci1.pri_contact_id
JOIN name_lookup nl2
  ON ci1.sec_contact_id = nl2.contact_id
ORDER BY nl1.id;

Result:

id  contact_id  contact_name    id  contact_id  contact_name    pri_contact_id  sec_contact_id  blah_blah
 1           1      John Doe     3           3   Jose Garcia                 1   3    Team A
 4           4   Larry Brown     2           2    Mary Smith                 4 2      Team B

So, having obtained our result, we can now clean up the SQL (only SELECT required fields - reduce any network traffic as well as I/O on the server) as follows:

SELECT 
  nl1.contact_name AS "Con_1 name",
  nl2.contact_name AS "Con_2 name",
  ci1.blah_blah AS "Team"
FROM name_lookup nl1
JOIN contact_info ci1 
  ON nl1.contact_id = ci1.pri_contact_id
JOIN name_lookup nl2
  ON ci1.sec_contact_id = nl2.contact_id
ORDER BY nl1.id;

Result:

Con_1 name     Con_2 name     Team
  John Doe     Jose Garcia  Team A
Larry Brown    Mary Smith   Team B

Et voilà - result as desired!

`EDIT (NULLs in sec_contact_id):`

It was pointed out to me that my answer wasn't as comprehensive as I might have liked it to be. What if sec_contact_id is NULL which, after all, is possible - you might have made the first one but a follow-up hasn't been completed yet?

So, I changed the table slightly and it now contains (as you'll see above - a considerably changed fiddle is available here - I wanted to run it on PostgreSQL also):

 Con_1 name   Con_2 name      Team
   John Doe  Jose Garcia    Team A
   John Doe         NULL    Team B
Larry Brown   Mary Smith    Team B

so, now, you have to use INNER JOINs thus:

  nl1.contact_name AS "Con_1 name",
  nl2.contact_name AS "Con_2 name",
  ci1.blah_blah AS "Team"
FROM contact_info ci1
    LEFT JOIN name_lookup nl1
        ON nl1.contact_id = ci1.pri_contact_id
    LEFT JOIN name_lookup nl2
        ON nl2.contact_id = ci1.sec_contact_id
ORDER BY nl1.id, ci1.blah_blah;

Result:

  Con_1 name      Con_2 name      Team
    John Doe     Jose Garcia    Team A
    John Doe            NULL    Team B
 Larry Brown      Mary Smith    Team B

So, now the record with the sec_contact_id = NULL datum appears in your resultset.

A couple of words of advice:

You really should consider (strongly) consider upgrading from 5.6 to the current version of MySQL 8 - you'll get window functions, generated columns, CHECK constraints - it's at version 22 now, and I haven't heard many complaints, so it'd be a good option!
Many people consider NULLs to be undesirable and avoid them like the plague - I tend to fall into this category. Therefore, you might wish to consider having two contact tables - pri_contact_info and sec_contact_info. You can decide if you wish to do that and ask a new question - if you do so, let me know here also!
You might consider answering this question (a comment below your original post): Can there ever be more than two people on a team? I e. There would be 3 columns with team members names for the same row?!
the PRIMARY KEY of the name_lookup table should be contact_id - the surrogate key id adds nothing to the mix! I assume that contact_id is assigned more or less randomly to employees as they are hired? So, it's essentially a surrogate key itself. Surrogate keys have their place, but sometimes they're not the way to go!
MySQL is by far and away the worst of the major RDBMS's when it comes to standards support, plus it has a plethora of non-standard "extensions" - in future, maybe you could delimit your SQL strings with apostrophes ' and not double quotes "? My personal suggestion is that you use double quotes for aliases of field names for the final result (presentation, not content) as I have done - it will make your SQL more readable and portable!

Licensed under: CC-BY-SA with attribution

Not affiliated with dba.stackexchange