Count distinct and Null value is eliminated by an aggregate
-
21-08-2019 - |
Question
I'm using SQL Server 2005. With the query below (simplified from my real query):
select a,count(distinct b),sum(a) from
(select 1 a,1 b union all
select 2,2 union all
select 2,null union all
select 3,3 union all
select 3,null union all
select 3,null) a
group by a
Is there any way to do a count distinct without getting
"Warning: Null value is eliminated by an aggregate or other SET operation."
Here are the alternatives I can think of:
- Turning ANSI_WARNINGS off
Separating into two queries, one with count distinct and a where clause to eliminate nulls, one with the sum:
select t1.a, t1.countdistinctb, t2.suma from ( select a,count(distinct b) countdistinctb from ( select 1 a,1 b union all select 2,2 union all select 2,null union all select 3,3 union all select 3,null union all select 3,null ) a where a.b is not null group by a ) t1 left join ( select a,sum(a) suma from ( select 1 a,1 b union all select 2,2 union all select 2,null union all select 3,3 union all select 3,null union all select 3,null ) a group by a ) t2 on t1.a=t2.a
Ignore the warning in the client
Is there a better way to do this? I'll probably go down route 2, but don't like the code duplication.
Solution
select a,count(distinct isnull(b,-1))-sum(distinct case when b is null then 1 else 0 end),sum(a) from
(select 1 a,1 b union all
select 2,2 union all
select 2,null union all
select 3,3 union all
select 3,null union all
select 3,null) a
group by a
Thanks to Eoin I worked out a way to do this. You can count distinct the values including the nulls and then remove the count due to nulls if there were any using a sum distinct.
OTHER TIPS
Anywhere you have a null possibly returned, use
CASE WHEN Column IS NULL THEN -1 ELSE Column END AS Column
That will sub out all your Null Values for -1 for the duration of the query and they'll be counted/aggregated as such, then you can just do the reverse in your fine wrapping query...
SELECT
CASE WHEN t1.a = -1 THEN NULL ELSE t1.a END as a
, t1.countdistinctb
, t2.suma
This is a late note, but being it was the return on Google, i wanted to mention it.
Changing NULL to another value is a Bad Idea(tm).
COUNT() is doing it, not DISTINCT.
Instead, use DISTINCT in an subquery and which returns a number, and aggregate that in the outer query.
A simple example of this is:
WITH A(A) AS (SELECT NULL UNION ALL SELECT NULL UNION ALL SELECT 1)
SELECT COUNT(*) FROM (SELECT DISTINCT A FROM A) B;
This allows for COUNT(*)
to be used, which does not ignore NULLs (because it counts records, not values).
If you don't like the code duplication then why not use a common table expression? e.g.
WITH x(a, b) AS
(
select 1 a,1 b union all
select 2,2 union all
select 2,null union all
select 3,3 union all
select 3,null union all
select 3,null
)
select t1.a, t1.countdistinctb, t2.suma from
(
select a,count(distinct b) countdistinctb from
x a
where a.b is not null
group by a
) t1
left join
(
select a,sum(a) suma from
x a
group by a
) t2 on t1.a=t2.a