How do I return my records grouped by NULL and NOT NULL?

https://stackoverflow.com/questions/239545

04-07-2019
|

Question

I have a table that has a processed_timestamp column -- if a record has been processed then that field contains the datetime it was processed, otherwise it is null.

I want to write a query that returns two rows:

NULL        xx -- count of records with null timestamps
NOT NULL    yy -- count of records with non-null timestamps

Is that possible?

Update: The table is quite large, so efficiency is important. I could just run two queries to calculate each total separately, but I want to avoid hitting the table twice if I can avoid it.

Solution

Oracle:

group by nvl2(field, 'NOT NULL', 'NULL')

OTHER TIPS

In MySQL you could do something like

SELECT 
    IF(ISNULL(processed_timestamp), 'NULL', 'NOT NULL') as myfield, 
    COUNT(*) 
FROM mytable 
GROUP BY myfield

In T-SQL (MS SQL Server), this works:

SELECT
  CASE WHEN Field IS NULL THEN 'NULL' ELSE 'NOT NULL' END FieldContent,
  COUNT(*) FieldCount
FROM
  TheTable
GROUP BY
  CASE WHEN Field IS NULL THEN 'NULL' ELSE 'NOT NULL' END

Try the following, it's vendor-neutral:

select
    'null    ' as type,
    count(*)   as quant
    from       tbl
    where      tmstmp is null
union all
select
    'not null' as type,
    count(*)   as quant
    from       tbl
    where      tmstmp is not null

After having our local DB2 guru look at this, he concurs: none of the solutions presented to date (including this one) can avoid a full table scan (of the table if timestamp is not indexed, or of the indexotherwise). They all scan every record in the table exactly once.

All the CASE/IF/NVL2() solutions do a null-to-string conversion for each row, introducing unnecessary load on the DBMS. This solution does not have that problem.

If it's oracle then you can do:

select decode(field,NULL,'NULL','NOT NULL'), count(*)
from table
group by decode(field,NULL,'NULL','NOT NULL');

I'm sure that other DBs allow for similar trick.

Stewart,

Maybe consider this solution. It is (also!) vendor non-specific.

SELECT count([processed_timestamp]) AS notnullrows, 
       count(*) - count([processed_timestamp]) AS nullrows 
FROM table

As for efficiency, this avoids 2x index seeks/table scans/whatever by including the results on one row. If you absolutely require 2 rows in the result, two passes over the set may be unavoidable because of unioning aggregates.

Hope this helps

Another MySQL method is to use the CASE operator, which can be generalised to more alternatives than IF():

SELECT CASE WHEN processed_timestamp IS NULL THEN 'NULL' 
            ELSE 'NOT NULL' END AS a,
       COUNT(*) AS n 
       FROM logs 
       GROUP BY a

If your database has an efficient COUNT(*) function for a table, you could COUNT whichever is the smaller number, and subtract.

SQL Server (starting with 2012):

SELECT IIF(ISDATE(processed_timestamp) = 0, 'NULL', 'NON NULL'), COUNT(*)
FROM MyTable
GROUP BY ISDATE(processed_timestamp);

I personally like Pax's solution, but if you absolutely require only one row returned (as I had recently), In MS SQL Server 2005/2008 you can "stack" the two queries using a CTE

with NullRows (countOf)
AS
(
    SELECT count(*) 
    FORM table 
    WHERE [processed_timestamp] IS NOT NULL
)
SELECT count(*) AS nulls, countOf
FROM table, NullRows
WHERE [processed_timestamp] IS NULL
GROUP BY countOf

Hope this helps

[T-SQL]:

select [case], count(*) tally
from (
  select 
  case when [processed_timestamp] is null then 'null'
  else 'not null'
  end [case]
  from myTable
) a

And you can add into the case statement whatever other values you'd like to form a partition, e.g. today, yesterday, between noon and 2pm, after 6pm on a Thursday.

Select Sum(Case When processed_timestamp IS NULL
                         Then 1
                         Else 0
                 End)                                                               not_processed_count,
          Sum(Case When processed_timestamp Is Not NULL
                         Then 1
                         Else 0
                 End)                                                               processed_count,
          Count(1)                                                                total
From table

Edit: didn't read carefully, this one returns a single row.

In Oracle

SELECT COUNT(*), COUNT(TIME_STAMP_COLUMN)
FROM TABLE;

count(*) returns the count of all rows

count(column_name) returns the number of rows which are not NULL, so

SELECT COUNT(*) - COUNT(TIME_STAMP_COLUMN) NUL_COUNT,
                  COUNT(TIME_STAMP_COLUMN) NON_NUL_COUNT
FROM TABLE

ought to do the job.

If the column is indexed, you might end up with some sort of range scan and avoid actually reading the table.

Another way in T-sql (sql-server)

select  count(case when t.timestamps is null 
                    then 1 
                    else null end) NULLROWS,
        count(case when t.timestamps is not null 
                    then 1 
                    else null end) NOTNULLROWS
from myTable t

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow