Why the RIGHT JOIN?
You potentially add NULL to the result of the subquery, which is a common pitfall in a NOT IN construct.
Consider this simple demo:
SELECT 5 NOT IN (VALUES (1), (2), (3)) --> TRUE
,5 NOT IN (VALUES (1), (2), (3), (NULL)) --> NULL (!)
,5 IN (VALUES (1), (2), (3)) --> FALSE
,5 IN (VALUES (1), (2), (3), (NULL)) --> NULL (!)
In other words:
"We do not know whether 5 is in the set, since at least one element is unknown and could be 5."
Since only TRUE qualifies in a WHERE clause (neither NULL nor FALSE passes the test), the NULL wouldn't affect WHERE id IN (...) at all.
But it affects
WHERE id NOT IN (...)
This expression never qualifies as soon as there is a NULL in the right-hand set: it can only evaluate to FALSE or NULL, never TRUE. Probably not as intended?
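To see the effect inside an actual WHERE clause, here is a minimal sketch. It assumes PostgreSQL syntax, and the inline sets t and s are made up for illustration; they stand in for real tables.

```sql
-- Hypothetical inline data: t is the outer set, s is the subquery set with a NULL
WITH t(id) AS (VALUES (1), (5))
   , s(id) AS (VALUES (1), (2), (NULL))
SELECT t.id
FROM   t
WHERE  t.id NOT IN (SELECT s.id FROM s);
-- returns no rows: 1 evaluates to FALSE, 5 evaluates to NULL

-- Same data, but NULL is filtered out of the right-hand set
WITH t(id) AS (VALUES (1), (5))
   , s(id) AS (VALUES (1), (2), (NULL))
SELECT t.id
FROM   t
WHERE  t.id NOT IN (SELECT s.id FROM s WHERE s.id IS NOT NULL);
-- returns one row: 5
```

Filtering NULL out of the subquery is one way to keep NOT IN usable, but it is easy to forget.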
Solution
The requirement is to
select all the Users that don't have LoginInfos
This can include users that don't have a row in table login either. Therefore, we need LEFT JOIN two times. And since it is not defined whether one user can have multiple rows in login, we also need DISTINCT or GROUP BY:
SELECT DISTINCT u.*
FROM users u
LEFT JOIN login l ON l.user_id = u.id
LEFT JOIN logininfo i ON i.login_id = l.id
WHERE i.login_id IS NULL
This covers all eventualities. You can ...
- remove the DISTINCT if there can be at most one login per user.
- replace the first LEFT JOIN with JOIN if there is at least one row in login for every user.
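For completeness, the GROUP BY variant mentioned above might look like the following sketch. It assumes that users.id is the primary key of users; PostgreSQL 9.1 or later then allows u.* in the SELECT list while grouping by u.id alone. Without that assumption, every selected column would have to be listed in GROUP BY.

```sql
-- Assumption: users.id is the PRIMARY KEY, so all columns of u
-- are functionally dependent on the GROUP BY column.
SELECT u.*
FROM   users u
LEFT   JOIN login l ON l.user_id = u.id
LEFT   JOIN logininfo i ON i.login_id = l.id
WHERE  i.login_id IS NULL
GROUP  BY u.id;
```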
This alternative with NOT EXISTS uses a subquery. But it works no matter how many rows per user there can be in table login. And it doesn't exhibit any of the aforementioned problems of NOT IN:
SELECT u.*
FROM users u
WHERE NOT EXISTS (
SELECT 1
FROM login l
JOIN logininfo i ON i.login_id = l.id
WHERE l.user_id = u.id
)
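
To try the NOT EXISTS query, a small test setup can help. This is only a sketch in PostgreSQL syntax: the table layouts, the name column and the sample rows are assumptions for illustration, since the real table definitions are not shown.

```sql
-- Hypothetical minimal schema; only the join columns matter here
CREATE TEMP TABLE users     (id int PRIMARY KEY, name text);
CREATE TEMP TABLE login     (id int PRIMARY KEY, user_id int);
CREATE TEMP TABLE logininfo (login_id int);

INSERT INTO users     VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
INSERT INTO login     VALUES (10, 1), (20, 2);  -- carol has no login at all
INSERT INTO logininfo VALUES (10);              -- only alice's login has info

SELECT u.*
FROM   users u
WHERE  NOT EXISTS (
   SELECT 1
   FROM   login l
   JOIN   logininfo i ON i.login_id = l.id
   WHERE  l.user_id = u.id
   )
ORDER  BY u.id;
-- returns bob (login without logininfo) and carol (no login)
```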