Question

I have seven SELECT statements whose results I need to output as separate columns. Normally I would use a crosstab for this, but I need a fast, efficient approach because the table has over 7 billion rows. I am using the Vertica database. Here are my statements:

SELECT COUNT(user_id) AS '20100101' FROM event_log_facts WHERE date_dim_id=20100101
SELECT COUNT(user_id) AS '20100102' FROM event_log_facts WHERE date_dim_id=20100102
SELECT COUNT(user_id) AS '20100103' FROM event_log_facts WHERE date_dim_id=20100103
SELECT COUNT(user_id) AS '20100104' FROM event_log_facts WHERE date_dim_id=20100104
SELECT COUNT(user_id) AS '20100105' FROM event_log_facts WHERE date_dim_id=20100105
SELECT COUNT(user_id) AS '20100106' FROM event_log_facts WHERE date_dim_id=20100106
SELECT COUNT(user_id) AS '20100107' FROM event_log_facts WHERE date_dim_id=20100107

The result should look something like this:

20100101 | 20100102 | 20100103 | 20100104 | 20100105 | 20100106 | 20100107
1234     | 1234     | 36564    | 45465    | 356754   | 3455     | 4556675

Solution

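You could use a series of queries unioned together. It's kind of ugly, but it should work. Note that each branch contributes its own row, so the count for a date appears in that date's column with NULL in the others: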

SELECT  
  COUNT(user_id) AS '20100101'  
 ,NULL AS '20100102'  
 ,NULL AS '20100103'  
 ,NULL AS '20100104'  
 ,NULL AS '20100105'  
FROM  
  event_log_facts  
WHERE  
  date_dim_id=20100101  
UNION  
SELECT  
  NULL AS '20100101'  
 ,COUNT(user_id) AS '20100102'  
 ,NULL AS '20100103'  
 ,NULL AS '20100104'  
 ,NULL AS '20100105'  
FROM   
  event_log_facts  
WHERE  
  date_dim_id=20100102  
UNION  
SELECT  
  NULL AS '20100101'  
 ,NULL AS '20100102'  
 ,COUNT(user_id) AS '20100103'  
 ,NULL AS '20100104'  
 ,NULL AS '20100105'  
FROM  
  event_log_facts  
WHERE  
  date_dim_id=20100103  

...and so on, following the same pattern for each remaining date.

OTHER TIPS

Wrap each query in parentheses, separate them with commas, and select them as scalar subqueries :)

SELECT
(SELECT COUNT(user_id) FROM event_log_facts WHERE date_dim_id=20100101) AS '20100101',
(SELECT COUNT(user_id) FROM event_log_facts WHERE date_dim_id=20100102) AS '20100102',
(SELECT COUNT(user_id) FROM event_log_facts WHERE date_dim_id=20100103) AS '20100103',
(SELECT COUNT(user_id) FROM event_log_facts WHERE date_dim_id=20100104) AS '20100104',
(SELECT COUNT(user_id) FROM event_log_facts WHERE date_dim_id=20100105) AS '20100105',
(SELECT COUNT(user_id) FROM event_log_facts WHERE date_dim_id=20100106) AS '20100106',
(SELECT COUNT(user_id) FROM event_log_facts WHERE date_dim_id=20100107) AS '20100107'

Or you could write a scalar function that takes the date_dim_id as a parameter and returns the count you want, then call it once per column (if your database supports scalar functions).

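You can also count conditionally in a single pass. COUNT ignores NULLs, so "condition OR NULL" only counts the rows where the condition is true:

SELECT
COUNT(date_dim_id=20100101 OR NULL) AS '20100101',
COUNT(date_dim_id=20100102 OR NULL) AS '20100102',
...
FROM event_log_facts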
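Consider using a pivot table; it is easier on the eyes :)

First union your results, then pivot them. Note that PIVOT with bracketed column names, as used below, is SQL Server syntax, so check that your database supports it.

Here's your example, and here is the SQLFiddle -> http://sqlfiddle.com/#!6/d41d8/6440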

SELECT PivT.* 
FROM
(
  SELECT 10 As Quantity, '20100101' AS DateDim
  UNION
  SELECT 21 , '20100102' 
  UNION
  SELECT 3 , '20100103' 
  UNION
  SELECT 41 , '20100104' 
  UNION
  SELECT 50 , '20100105' 
  UNION
  SELECT 26 , '20100106' 
  UNION
  SELECT 78 , '20100107' 
) T
 PIVOT (avg(Quantity) for DateDim in ([20100101],
                         [20100102],
                         [20100103],
                         [20100104],
                         [20100105],
                         [20100106],
                         [20100107])
) As PivT
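
The pivot example above uses hard-coded quantities. As a sketch of where the real rows could come from, a single GROUP BY query produces one row per date with its count, which has the same shape as the derived table T (a quantity and a date per row). The alias user_count is hypothetical, and the WHERE range is an assumption based on the seven dates in the question.

```sql
-- Sketch: one row per date instead of one column per date.
-- The result can feed a pivot step or be reshaped in the client.
SELECT
  date_dim_id,
  COUNT(user_id) AS user_count   -- hypothetical alias
FROM event_log_facts
WHERE date_dim_id BETWEEN 20100101 AND 20100107
GROUP BY date_dim_id
ORDER BY date_dim_id;
```

If your database has no PIVOT, turning these seven rows into columns in the application or reporting layer may be simpler than doing it in SQL.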
Licensed under: CC-BY-SA with attribution