Why does my SELECT query not return null values? [duplicate]
04-03-2021
Question
I have the following SQL script:
CREATE temporary table if not EXISTS the_values (
key SERIAL,
value INTEGER NULL
);
insert into the_values(value) values (null),(1),(null),(2),(3),(4),(5),(6),(10),(null),(null);
select *
from the_values
where value not in (1,2,3,4,5,6,10);
And I noticed that the query:
select *
from the_values
where value not in (1,2,3,4,5,6,10);
does not return the rows whose value is NULL, and that caught my attention. I want to know why that happens. I am more interested in the technical reason for this behavior than in the obvious workaround:
select *
from the_values
where value not in (1,2,3,4,5,6,10)
or value IS NULL;
Solution
If we simplify the insert and query to:
insert into T(x) values (null),(1);
select x from T where x not in (1);
For null the predicate will evaluate to:
select x from T where null not in (1) <=>
select x from T where not null in (1) <=>
select x from T where not null <=>
select x from T where null
so that row does not satisfy the predicate: WHERE keeps a row only when the predicate evaluates to true, and null is not true. If you compare something with null (think of it as unknown), the result is null.
For 1 the predicate will evaluate to:
select x from T where 1 not in (1) <=>
select x from T where False
so that row does not satisfy the predicate either.
That is, you end up with an empty result.
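To see the two cases above side by side, it can help to select the predicate itself instead of filtering on it. A minimal sketch, assuming PostgreSQL and a throwaway temporary table named t (the name and column type are chosen here for illustration, following the simplified example):

```sql
-- Hypothetical scratch table mirroring the simplified example above.
CREATE TEMPORARY TABLE t (x integer);
INSERT INTO t(x) VALUES (NULL), (1);

-- Show what the predicate evaluates to for each row.
SELECT x, x NOT IN (1) AS predicate
FROM t;
-- x = NULL -> predicate is NULL  (row is filtered out by WHERE)
-- x = 1    -> predicate is false (row is filtered out by WHERE)
```

Neither row yields true, which is why the filtered query comes back empty.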
Other tips
Generally, avoid NOT IN when NULL values can be involved on either side. The Postgres Wiki suggests as much. And avoid NOT IN (SELECT ...) in any case. The Wiki has the explanation, and it also happens to have the perfect answer to your core question:
NOT IN behaves in unexpected ways if there is a null present:
select * from foo where col not in (1,null); -- always returns 0 rows
select * from foo where col not in (select x from bar); -- returns 0 rows if any value of bar.x is null
This happens because col IN (1,null) returns TRUE if col=1, and NULL otherwise (i.e. it can never return FALSE). Since NOT (TRUE) is FALSE, but NOT (NULL) is still NULL, there is no way that NOT (col IN (1,null)) (which is the same thing as col NOT IN (1,null)) can return TRUE under any circumstances.
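The quoted behavior can be checked directly. The sketch below assumes PostgreSQL and uses inline VALUES lists in place of real foo and bar tables (the sample rows are made up for illustration). The second statement shows NOT EXISTS, a commonly suggested replacement for NOT IN (SELECT ...) that does not have the null problem.

```sql
-- Truth table for IN / NOT IN when the list contains a null.
SELECT col,
       col IN (1, NULL)     AS in_result,
       col NOT IN (1, NULL) AS not_in_result
FROM (VALUES (1), (2), (NULL)) AS foo(col);
-- col = 1    -> in_result true, not_in_result false
-- col = 2    -> in_result NULL, not_in_result NULL
-- col = NULL -> in_result NULL, not_in_result NULL

-- NOT EXISTS keeps a row whenever no matching row is found in bar,
-- so a null in bar.x no longer empties the result.
WITH foo(col) AS (VALUES (1), (2), (NULL)),
     bar(x)   AS (VALUES (1), (NULL))
SELECT col
FROM foo
WHERE NOT EXISTS (SELECT 1 FROM bar WHERE bar.x = foo.col);
-- returns the rows with col = 2 and col = NULL
```

Note that NOT EXISTS also returns the row where col itself is null, which may or may not be what you want.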
Here is a perhaps less obvious solution:
SELECT *
FROM the_values
WHERE value IN (1,2,3,4,5,6,10) IS NOT TRUE;
Or:
..
WHERE value = ANY('{1,2,3,4,5,6,10}') IS NOT TRUE;
It's shorter and typically faster than your "obvious" one.
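To illustrate why IS NOT TRUE picks up the null rows, here is a small comparison of the two predicates. It is only a sketch, assuming PostgreSQL; the three sample values are arbitrary and stand in for the_values:

```sql
-- Compare plain NOT IN with the IS NOT TRUE form, row by row.
SELECT value,
       value NOT IN (1,2,3,4,5,6,10)           AS not_in,
       (value IN (1,2,3,4,5,6,10)) IS NOT TRUE AS in_is_not_true
FROM (VALUES (NULL), (1), (7)) AS t(value);
-- value = NULL -> not_in NULL,  in_is_not_true true
-- value = 1    -> not_in false, in_is_not_true false
-- value = 7    -> not_in true,  in_is_not_true true
```

IS NOT TRUE never returns null: it maps both false and null to true, so rows with a null value pass the WHERE clause.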
For long lists, consider switching to a different (faster) technique: