Count distinct and Null value is eliminated by an aggregate

https://stackoverflow.com/questions/851642

21-08-2019

Question

I'm using SQL Server 2005. With the query below (simplified from my real query):

select a,count(distinct b),sum(a) from 
(select 1 a,1 b union all
select 2,2 union all
select 2,null union all
select 3,3 union all
select 3,null union all
select 3,null) a
group by a

Is there any way to do a count distinct without getting

"Warning: Null value is eliminated by an aggregate or other SET operation."

Here are the alternatives I can think of:

1. Turning ANSI_WARNINGS off

2. Separating into two queries, one with count distinct and a where clause to eliminate nulls, one with the sum:

select t1.a, t1.countdistinctb, t2.suma from
(
    select a,count(distinct b) countdistinctb from 
    (
        select 1 a,1 b union all
        select 2,2 union all
        select 2,null union all
        select 3,3 union all
        select 3,null union all
        select 3,null
    ) a
    where a.b is not null
    group by a
) t1
left join
(
    select a,sum(a) suma from 
    (
        select 1 a,1 b union all
        select 2,2 union all
        select 2,null union all
        select 3,3 union all
        select 3,null union all
        select 3,null
    ) a
    group by a
) t2 on t1.a=t2.a

3. Ignoring the warning in the client

Is there a better way to do this? I'll probably go down route 2, but I don't like the code duplication.
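For comparison, the first route (turning ANSI_WARNINGS off) might look like the sketch below. It only illustrates the session setting around the original query. Note that the setting affects more than this one warning: with it off, SQL Server also stops raising the error for character data truncated on INSERT or UPDATE, so it seems safest to restore it immediately afterwards.

```sql
-- Sketch only: suppress the warning for this one statement, then restore the default.
set ansi_warnings off

select a, count(distinct b), sum(a) from
(select 1 a, 1 b union all
 select 2, 2 union all
 select 2, null union all
 select 3, 3 union all
 select 3, null union all
 select 3, null) a
group by a

-- Restore the setting so later statements in the session behave as usual.
set ansi_warnings on
```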

Solution

select a,count(distinct isnull(b,-1))-sum(distinct case when b is null then 1 else 0 end),sum(a) from 
    (select 1 a,1 b union all
    select 2,2 union all
    select 2,null union all
    select 3,3 union all
    select 3,null union all
    select 3,null) a
    group by a

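Thanks to Eoin, I worked out a way to do this. You can count the distinct values with the NULLs included (replaced by -1 via isnull), and then subtract the one extra count contributed by the NULLs, if there were any, using a sum distinct. Neither aggregate ever sees a NULL, so no warning is raised. Note that this relies on -1 never occurring as a real value of b.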
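To see why the subtraction works, it may help to split the expression into its two parts. The sketch below uses the same sample data; the column aliases (distinct_with_sentinel, null_present, countdistinctb, suma) are names chosen here for illustration, not part of the original answer.

```sql
select a,
       -- NULLs are replaced by -1, so all NULLs in a group count as one extra distinct value
       count(distinct isnull(b, -1)) distinct_with_sentinel,
       -- distinct flags are {0}, {1} or {0,1}, so this is 1 when the group has a NULL, else 0
       sum(distinct case when b is null then 1 else 0 end) null_present,
       count(distinct isnull(b, -1))
         - sum(distinct case when b is null then 1 else 0 end) countdistinctb,
       sum(a) suma
from
(select 1 a, 1 b union all
 select 2, 2 union all
 select 2, null union all
 select 3, 3 union all
 select 3, null union all
 select 3, null) a
group by a
order by a
-- a  distinct_with_sentinel  null_present  countdistinctb  suma
-- 1  1                       0             1               1
-- 2  2                       1             1               4
-- 3  2                       1             1               9
```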

OTHER TIPS

Anywhere you have a null possibly returned, use

CASE WHEN Column IS NULL THEN -1 ELSE Column END AS Column

That will substitute -1 for all your NULL values for the duration of the query, and they'll be counted/aggregated as such. Then you can just do the reverse in your final wrapping query...

SELECT  
    CASE WHEN t1.a = -1 THEN NULL ELSE t1.a END as a
    , t1.countdistinctb
    , t2.suma

This is a late note, but since this page was what Google returned, I wanted to mention it.

Changing NULL to another value is a Bad Idea(tm).

It is COUNT() that eliminates the NULLs, not DISTINCT.

Instead, use DISTINCT in a subquery, and count the rows it returns in the outer query.

A simple example of this is:

WITH A(A) AS (SELECT NULL UNION ALL SELECT NULL UNION ALL SELECT 1)
SELECT COUNT(*) FROM (SELECT DISTINCT A FROM A) B;

This allows for COUNT(*) to be used, which does not ignore NULLs (because it counts records, not values).
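As a rough illustration of the difference, the sketch below runs the three counts side by side over the same three-row CTE (the column aliases are made up for this example). Keep in mind that the DISTINCT subquery above returns 2 for this data, because NULL is kept as one distinct row, whereas COUNT(DISTINCT A) returns 1. The subquery form therefore answers a slightly different question from the original count(distinct b).

```sql
WITH A(A) AS (SELECT NULL UNION ALL SELECT NULL UNION ALL SELECT 1)
SELECT COUNT(*)          AS all_rows,          -- 3: counts rows, NULLs included
       COUNT(A)          AS non_null_values,   -- 1: skips NULLs and raises the warning
       COUNT(DISTINCT A) AS distinct_non_null  -- 1: also skips NULLs and raises the warning
FROM A;
```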

If you don't like the code duplication, then why not use a common table expression? For example:

WITH x(a, b) AS
(
    select 1 a,1 b union all
    select 2,2 union all
    select 2,null union all
    select 3,3 union all
    select 3,null union all
    select 3,null
)
select t1.a, t1.countdistinctb, t2.suma from
(
    select a,count(distinct b) countdistinctb from
    x a
    where a.b is not null
    group by a
) t1
left join
(
    select a,sum(a) suma from
    x a
    group by a
) t2 on t1.a=t2.a
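One thing to watch with the two-subquery form: because t1 filters out NULLs before grouping, a group whose b values are all NULL never appears in t1, and with t1 on the left of the join it drops out of the result entirely. A possible variant, sketched below, puts the unfiltered sum on the left and defaults the missing count to 0. The extra a = 4 row is hypothetical data added here only to show the case.

```sql
WITH x(a, b) AS
(
    select 1, 1 union all
    select 2, 2 union all
    select 2, null union all
    select 3, 3 union all
    select 3, null union all
    select 3, null union all
    select 4, null   -- hypothetical group in which every b is NULL
)
select t2.a, isnull(t1.countdistinctb, 0) countdistinctb, t2.suma
from
(
    -- unfiltered, so every group is present
    select a, sum(a) suma
    from x
    group by a
) t2
left join
(
    -- NULLs removed before the aggregate, so no warning is raised
    select a, count(distinct b) countdistinctb
    from x
    where b is not null
    group by a
) t1 on t1.a = t2.a
```

With this data, the a = 4 group comes back with a count of 0 and a sum of 4 instead of being lost.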

Licensed under: CC-BY-SA with attribution
