Combining SQL statements to optimize

https://stackoverflow.com//questions/21005796

20-12-2019

Question

I have two different queries on the same table. The first one has the form:

SELECT MIN(A) AS MIN_A, MAX(B) AS MAX_B
FROM myTable
WHERE timeStamp > ?
GROUP BY type;

And the other one:

SELECT SUM (CASE WHEN io > 0 THEN 1 ELSE 0 END) as io_cnt
FROM (
     SELECT
     (CASE WHEN SUM(io_ops) > 0 THEN 1 ELSE 0 END) as io
     FROM myTable
     WHERE timestamp > ? AND type = ?
     GROUP BY id
) t;

The table has the columns A, B, id, io_ops, timestamp, and type. Right now I call the first query from Java, loop over the result set, and call the second query once for every type returned by the first query.

I need both the MIN(A) and MAX(B) values from the first query and the io counts from the second query. Is it possible to do this in one query? I am using Amazon Redshift as my database.

Solution

Redshift is pretty limited. It is based on PostgreSQL 8.0.2, and many newer PostgreSQL features are not supported. This should work (untested):

SELECT t.type, min(min_a) AS min_a, max(max_b) AS max_b
      ,count(io > 0 OR NULL) AS io_cnt
FROM  (
   SELECT type, min(a) as min_a, max(b) as max_b
         ,sum(io_ops) AS io
   FROM   myTable
   WHERE  timestamp > ?
   GROUP  BY type, id
   ) t
GROUP  BY t.type;
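
The expression count(io > 0 OR NULL) relies on count() skipping NULL values: io > 0 OR NULL is TRUE when io is positive and NULL otherwise, so only the (type, id) groups with a positive total of io_ops are counted. If a particular engine rejects the boolean form, an equivalent spelling with CASE, in the style of the question's second query, might look like the sketch below. It is likewise untested on Redshift and assumes the same myTable columns as above.

```sql
-- Inner query: one row per (type, id) with partial min/max and the total io_ops.
-- Outer query: roll up per type; the min of the partial mins and the max of the
-- partial maxes are the overall min and max for that type.
SELECT t.type,
       min(t.min_a) AS min_a,
       max(t.max_b) AS max_b,
       sum(CASE WHEN t.io > 0 THEN 1 ELSE 0 END) AS io_cnt  -- ids with positive total io_ops
FROM  (
   SELECT type, id, min(a) AS min_a, max(b) AS max_b, sum(io_ops) AS io
   FROM   myTable
   WHERE  timestamp > ?
   GROUP  BY type, id
   ) t
GROUP  BY t.type;
```

Either form returns one row per type, so the per-type loop in the application is no longer needed.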

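Depending on data distribution, the following variant may or may not be faster. It computes min and max in a separate subquery grouped by type only and joins that to the per-id sums: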

SELECT t.type, m.min_a, m.max_b, count(io > 0 OR NULL) AS io_cnt
FROM  (
   SELECT type, sum(io_ops) AS io
   FROM   myTable
   WHERE  timestamp > ?
   GROUP  BY type, id
   ) t
JOIN  (
   SELECT type, min(a) as min_a, max(b) as max_b
   FROM   myTable
   WHERE  timestamp > ?
   GROUP  BY type
   ) m USING (type)
GROUP  BY  t.type, m.min_a, m.max_b;

Licensed under: CC-BY-SA with attribution