Question

Imagine these 3 tables:

User:            Login:            LoginInfo:
-----            ------            ----------
id:integer       id:integer        id:integer
name:string      user_id:integer   login_id:integer

Is it possible to select all the Users that don't have LoginInfos without using a subquery/subselect like the one below? (I wrote this without checking it against the DB.)

select id from users 
where id not in (
  select distinct(user_id) from logins 
  right join login_infos on logins.id = login_infos.login_id
)

I'm using Postgres as database.

Solution

Why the RIGHT JOIN?

You potentially add NULL to the result of the subquery, which is a common pitfall in a NOT IN construct.

Consider this simple demo:

SELECT 5 NOT IN (VALUES (1), (2), (3))         --> TRUE
      ,5 NOT IN (VALUES (1), (2), (3), (NULL)) --> NULL (!)
      ,5     IN (VALUES (1), (2), (3))         --> FALSE
      ,5     IN (VALUES (1), (2), (3), (NULL)) --> NULL (!)

In other words:
"We do not know whether 5 is in the set, since at least one element is unknown and could be 5."
In a WHERE clause only TRUE qualifies (neither NULL nor FALSE passes the test), so a NULL in the set makes no difference to WHERE id IN (...): matching rows still qualify, and non-matching rows are excluded either way.

But it does affect

WHERE id NOT IN (...)

This expression can never evaluate to TRUE once there is a NULL in the right-hand set, so no row qualifies.
Probably not what was intended.
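
To see the effect on a whole query rather than on a single expression, here is a minimal sketch. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in for Postgres, since both treat NULL the same way in IN / NOT IN. The table candidates and the sample rows are hypothetical and only play the role of the subquery result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    -- hypothetical stand-in for the subquery result; note the NULL
    CREATE TABLE candidates (user_id INTEGER);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cid');
    INSERT INTO candidates VALUES (1), (NULL);
""")

# NOT IN against a set containing NULL: no row can evaluate to TRUE
not_in_rows = conn.execute(
    "SELECT id FROM users "
    "WHERE id NOT IN (SELECT user_id FROM candidates) ORDER BY id"
).fetchall()

# IN is unaffected by the NULL: the matching row still qualifies
in_rows = conn.execute(
    "SELECT id FROM users "
    "WHERE id IN (SELECT user_id FROM candidates) ORDER BY id"
).fetchall()

# Filtering the NULL out of the set restores the expected behavior
filtered_rows = conn.execute(
    "SELECT id FROM users "
    "WHERE id NOT IN (SELECT user_id FROM candidates "
    "                 WHERE user_id IS NOT NULL) ORDER BY id"
).fetchall()

print(not_in_rows)    # []
print(in_rows)        # [(1,)]
print(filtered_rows)  # [(2,), (3,)]
```

Filtering with IS NOT NULL is one way to keep NOT IN usable, but it is easy to forget, which is one reason to prefer the forms shown further down.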


Solution

The requirement is to

select all the Users that don't have LoginInfos

This can include users that don't have a row in table login either. Therefore, we need LEFT JOIN two times. And since it is not defined whether one user can have multiple rows in login, we also need DISTINCT or GROUP BY:

SELECT DISTINCT u.*
FROM   users u
LEFT   JOIN login     l ON l.user_id  = u.id
LEFT   JOIN logininfo i ON i.login_id = l.id
WHERE  i.login_id IS NULL

This covers all eventualities. You can ...

  • remove the DISTINCT if there can be at most one login per user.
  • replace the first LEFT JOIN with JOIN if there is at least one row in login for every user.
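
The need for DISTINCT (or GROUP BY) can be checked with a small sketch. It assumes SQLite via Python's sqlite3 module in place of Postgres, and the sample rows are made up: one user with two logins and no logininfo rows, and one user with no login at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users     (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE login     (id INTEGER PRIMARY KEY, user_id INTEGER);
    CREATE TABLE logininfo (id INTEGER PRIMARY KEY, login_id INTEGER);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
    -- ann has two logins, neither has a logininfo row; bob has no login
    INSERT INTO login VALUES (10, 1), (11, 1);
""")

joins = """
    FROM   users u
    LEFT   JOIN login     l ON l.user_id  = u.id
    LEFT   JOIN logininfo i ON i.login_id = l.id
    WHERE  i.login_id IS NULL
"""

def ids(sql):
    """Run a query and return the first column as a plain list."""
    return [row[0] for row in conn.execute(sql)]

# Without DISTINCT, ann appears once per login row
without_distinct = ids("SELECT u.id" + joins + "ORDER BY u.id")

# DISTINCT and GROUP BY both collapse the duplicates
with_distinct = ids("SELECT DISTINCT u.id" + joins + "ORDER BY u.id")
with_group_by = ids("SELECT u.id" + joins + "GROUP BY u.id ORDER BY u.id")

print(without_distinct)  # [1, 1, 2]
print(with_distinct)     # [1, 2]
print(with_group_by)     # [1, 2]
```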

This alternative with NOT EXISTS uses a subquery.
But it works no matter how many rows per user there can be in table login. And it doesn't exhibit any of the aforementioned problems of NOT IN:

SELECT u.*
FROM   users u
WHERE  NOT EXISTS (
   SELECT 1
   FROM   login     l
   JOIN   logininfo i ON i.login_id = l.id
   WHERE  l.user_id = u.id
   )
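
For comparison, here is a sketch that runs the double LEFT JOIN form and the NOT EXISTS form against the same data. It assumes SQLite via Python's sqlite3 module rather than Postgres, and the four sample users are hypothetical. The two forms agree on the simple cases. They differ for a user who has several logins of which only some have a logininfo row (dee below): the join form returns that user, the NOT EXISTS form does not. Which result is wanted depends on how the requirement is read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users     (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE login     (id INTEGER PRIMARY KEY, user_id INTEGER);
    CREATE TABLE logininfo (id INTEGER PRIMARY KEY, login_id INTEGER);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cid'), (4, 'dee');
    -- ann: one login with info      bob: one login without info
    -- cid: no login at all          dee: one login with info, one without
    INSERT INTO login     VALUES (10, 1), (20, 2), (40, 4), (41, 4);
    INSERT INTO logininfo VALUES (100, 10), (400, 40);
""")

left_join_ids = [row[0] for row in conn.execute("""
    SELECT DISTINCT u.id
    FROM   users u
    LEFT   JOIN login     l ON l.user_id  = u.id
    LEFT   JOIN logininfo i ON i.login_id = l.id
    WHERE  i.login_id IS NULL
    ORDER  BY u.id
""")]

not_exists_ids = [row[0] for row in conn.execute("""
    SELECT u.id
    FROM   users u
    WHERE  NOT EXISTS (
       SELECT 1
       FROM   login     l
       JOIN   logininfo i ON i.login_id = l.id
       WHERE  l.user_id = u.id
       )
    ORDER  BY u.id
""")]

print(left_join_ids)   # [2, 3, 4]  (dee qualifies through login 41)
print(not_exists_ids)  # [2, 3]     (dee has a logininfo via login 40)
```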

OTHER TIPS

-- Note: the inner join skips users that have no row in login at all.
select distinct u.id
from
    users u
    inner join
    login l on u.id = l.user_id
    left join
    logininfo li on l.id = li.login_id
where li.id is null

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow