Same Join Constraints on Multiple Columns
Question
We have a situation that I have been able to recreate with the following simple example. I have the following two sample tables:
CREATE TABLE contact_info
(
id INT UNSIGNED AUTO_INCREMENT,
priContactId INT,
secContactId INT,
blahBlah VARCHAR(32),
PRIMARY KEY(id)
);
and
CREATE TABLE name_lookup
(
id INT UNSIGNED AUTO_INCREMENT,
contactID INT,
contactName VARCHAR(32),
PRIMARY KEY(id)
);
I populate them as follows:
INSERT INTO contact_info(priContactId, secContactId, blahBlah) VALUES(1, 3, "Team A"), (4, 2, "Team B");
INSERT INTO name_lookup(contactID, contactName) VALUES(1, "John Doe"), (2, "Mary Smith"), (3, "Jose Garcia"), (4, "Larry Brown");
Obviously, the contents of the tables are as follows:
+----+--------------+--------------+----------+
| id | priContactId | secContactId | blahBlah |
+----+--------------+--------------+----------+
| 1 | 1 | 3 | Team A |
| 2 | 4 | 2 | Team B |
+----+--------------+--------------+----------+
+----+-----------+-------------+
| id | contactID | contactName |
+----+-----------+-------------+
| 1 | 1 | John Doe |
| 2 | 2 | Mary Smith |
| 3 | 3 | Jose Garcia |
| 4 | 4 | Larry Brown |
+----+-----------+-------------+
We would like to perform a JOIN operation so that we get output like this:
+-------------+-------------+--------+
| John Doe | Jose Garcia | Team A |
+-------------+-------------+--------+
| Larry Brown | Mary Smith | Team B |
+-------------+-------------+--------+
The join constraints for both the priContactId
and secContactId
columns are the same and I am having tough time figuring out what the JOIN query should look like.
FYI, we are using MySQL version 5.6.49
.
Solution
This is an interesting example of where an (indirect) SELF-JOIN
s prove useful!
In order to answer your problem, I did the following (all the SQL below is available on the fiddle here):
I used your DDL and DML as supplied in the question - thanks for that (and a +1 - you've only asked two questions, so I'd consider you a New Contributor and it's great to see that some people take the trouble to supply DDL and DML - if only all OP's did the same!).
CREATE TABLE name_lookup
(
id INT UNSIGNED AUTO_INCREMENT,
contact_id INT,
contact_name VARCHAR(32),
PRIMARY KEY(id)
);
CREATE TABLE contact_info
(
id INT UNSIGNED AUTO_INCREMENT,
pri_contact_id INT,
sec_contact_id INT,
blah_blah VARCHAR(32),
PRIMARY KEY(id)
);
Populate them:
INSERT INTO name_lookup(contact_id, contact_name)
VALUES
(1, "John Doe"), (2, "Mary Smith"),
(3, "Jose Garcia"), (4, "Larry Brown");
and
INSERT INTO contact_info(pri_contact_id, sec_contact_id, blah_blah)
VALUES(1, 3, "Team A"), (4, 2, "Team B"), (1, NULL, "Team A");
Notice the final record which has sec_contact_id
= NULL
- see the discussion in the EDIT
below. I'm implicitly assuming that the team identity is determined by the pri_contact_id
- define as necessary.
You'll also notice that I use snake_case
or lower_case_with_underscores
- personal preference - pick a style and stick to it!.
I've left extra fields in the SQL in the fiddle so that you can see the thought processes involved and how I arrived at a solution!
SELECT
nl1.id, nl1.contact_id, nl1.contact_name,
ci1.pri_contact_id, ci1.sec_contact_id, ci1.blah_blah
FROM name_lookup nl1
JOIN contact_info ci1
ON nl1.contact_id = ci1.pri_contact_id;
Result:
id contact_id contact_name pri_contact_id sec_contact_id blah_blah
1 1 John Doe 1 3 Team A
4 4 Larry Brown 4 2 Team B
So, now we join name_lookup
back on itself using contact_info
as the link between them.
SELECT
nl1.id, nl1.contact_id, nl1.contact_name,
nl2.id, nl2.contact_id, nl2.contact_name,
ci1.pri_contact_id, ci1.sec_contact_id, ci1.blah_blah
FROM name_lookup nl1
JOIN contact_info ci1
ON nl1.contact_id = ci1.pri_contact_id
JOIN name_lookup nl2
ON ci1.sec_contact_id = nl2.contact_id
ORDER BY nl1.id;
Result:
id contact_id contact_name id contact_id contact_name pri_contact_id sec_contact_id blah_blah
1 1 John Doe 3 3 Jose Garcia 1 3 Team A
4 4 Larry Brown 2 2 Mary Smith 4 2 Team B
So, having obtained our result, we can now clean up the SQL (only SELECT
required fields - reduce any network traffic as well as I/O on the server) as follows:
SELECT
nl1.contact_name AS "Con_1 name",
nl2.contact_name AS "Con_2 name",
ci1.blah_blah AS "Team"
FROM name_lookup nl1
JOIN contact_info ci1
ON nl1.contact_id = ci1.pri_contact_id
JOIN name_lookup nl2
ON ci1.sec_contact_id = nl2.contact_id
ORDER BY nl1.id;
Result:
Con_1 name Con_2 name Team
John Doe Jose Garcia Team A
Larry Brown Mary Smith Team B
Et voilà - result as desired!
EDIT (NULLs in sec_contact_id):
It was pointed out to me that my answer wasn't as comprehensive as I might have liked it to be. What if sec_contact_id
is NULL
which, after all, is possible - you might have made the first one but a follow-up hasn't been completed yet?
So, I changed the table slightly and it now contains (as you'll see above - a considerably changed fiddle is available here - I wanted to run it on PostgreSQL also):
Con_1 name Con_2 name Team
John Doe Jose Garcia Team A
John Doe NULL Team B
Larry Brown Mary Smith Team B
so, now, you have to use INNER JOIN
s thus:
nl1.contact_name AS "Con_1 name",
nl2.contact_name AS "Con_2 name",
ci1.blah_blah AS "Team"
FROM contact_info ci1
LEFT JOIN name_lookup nl1
ON nl1.contact_id = ci1.pri_contact_id
LEFT JOIN name_lookup nl2
ON nl2.contact_id = ci1.sec_contact_id
ORDER BY nl1.id, ci1.blah_blah;
Result:
Con_1 name Con_2 name Team
John Doe Jose Garcia Team A
John Doe NULL Team B
Larry Brown Mary Smith Team B
So, now the record with the sec_contact_id = NULL
datum appears in your resultset.
A couple of words of advice:
You really should consider (strongly) consider upgrading from 5.6 to the current version of MySQL 8 - you'll get window functions, generated columns,
CHECK
constraints - it's at version 22 now, and I haven't heard many complaints, so it'd be a good option!Many people consider
NULL
s to be undesirable and avoid them like the plague - I tend to fall into this category. Therefore, you might wish to consider having two contact tables -pri_contact_info
andsec_contact_info
. You can decide if you wish to do that and ask a new question - if you do so, let me know here also!You might consider answering this question (a comment below your original post):
Can there ever be more than two people on a team? I e. There would be 3 columns with team members names for the same row?
!the
PRIMARY KEY
of thename_lookup
table should becontact_id
- the surrogate keyid
adds nothing to the mix! I assume thatcontact_id
is assigned more or less randomly to employees as they are hired? So, it's essentially a surrogate key itself. Surrogate keys have their place, but sometimes they're not the way to go!MySQL is by far and away the worst of the major RDBMS's when it comes to standards support, plus it has a plethora of non-standard
"extensions"
- in future, maybe you could delimit your SQL strings with apostrophes'
and not double quotes"
? My personalsuggestion
is that you use double quotes for aliases of field names for the final result (presentation, not content) as I have done - it will make your SQL more readable and portable!