Question

I have a time-indexed Oracle table which I'm trying to query by date range. I also want to do data reduction in the query so I don't get overwhelmed with too much data.

The standalone date query (2352 rows in 0.203s):

select oracle_time from t_ssv_soh_packets0
where oracle_time >= TIMESTAMP '2009-01-01 00:00:00' 
AND oracle_time <= TIMESTAMP '2009-01-31 00:00:00' 

The standalone reduction query (1017 rows in 0.89s):

select oracle_time from t_ssv_soh_packets0
where (rowid,0) in (select rowid, mod(rownum,50) from t_ssv_soh_packets0)

When I try to combine them it takes forever (48 rows in 32.547s):

select oracle_time from t_ssv_soh_packets0
where oracle_time >= TIMESTAMP '2009-01-01 00:00:00' 
AND oracle_time <= TIMESTAMP '2009-01-31 00:00:00' 
AND (rowid,0) in (select rowid, mod(rownum,50) from t_ssv_soh_packets0)

Obviously I'm doing something fundamentally wrong here but I don't know how else to both query by date and reduce the data.


Solution 4

Thanks to both 'Narveson' and 'nate c' for the pointers; I finally figured it out. Here is the (probably Oracle-specific) query that I came up with:

select oracle_time from t_ssv_soh_packets0
where oracle_time >= TIMESTAMP '2009-01-01 00:00:00'
AND oracle_time <= TIMESTAMP '2009-01-31 00:00:00'
group by oracle_time, rownum
having mod(rownum, 50)=0

This query returns 47 rows in 0.031s. The original time query had 2352 rows so that makes sense.

The ORAFAQ helped me get to the final solution.
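The 47-row result can be sanity-checked with a count: keeping every 50th of 2352 rows should give floor(2352 / 50) = 47. A minimal sketch, in which the column aliases total_rows and expected_rows are illustrative names and not part of the original query:

```sql
-- Count the rows in the date range and the number expected
-- after keeping every 50th row.
select count(*)             as total_rows,
       floor(count(*) / 50) as expected_rows
from t_ssv_soh_packets0
where oracle_time >= TIMESTAMP '2009-01-01 00:00:00'
  and oracle_time <= TIMESTAMP '2009-01-31 00:00:00'
```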

OTHER TIPS

You are evaluating your reduction logic against rows that are not in your chosen date range.

Apply the reduction logic to a subquery containing your date range.

LATER: Here's what I meant.

select oracle_time from (
  select oracle_time, rownum as limited_row_num
  from t_ssv_soh_packets0 
  where oracle_time >= TIMESTAMP '2009-01-01 00:00:00'  
  AND oracle_time <= TIMESTAMP '2009-01-31 00:00:00'  
) time_range
where mod(limited_row_num,50) =  0
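ROWNUM numbers rows in whatever order Oracle happens to return them, so the rows kept are not guaranteed to be evenly spaced in time. If every 50th row in time order is wanted, one possible variation is to number the rows with the ROW_NUMBER() analytic function. This is a sketch, and the alias rn is an illustrative name:

```sql
-- Number only the rows inside the date range, in time order,
-- then keep every 50th one.
select oracle_time
from (
  select oracle_time,
         row_number() over (order by oracle_time) as rn
  from t_ssv_soh_packets0
  where oracle_time >= TIMESTAMP '2009-01-01 00:00:00'
    and oracle_time <= TIMESTAMP '2009-01-31 00:00:00'
)
where mod(rn, 50) = 0
order by oracle_time
```

The date filter still runs first, so the numbering never touches rows outside the range.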

Get rid of the in

Why use this?:

select oracle_time from t_ssv_soh_packets0
where (rowid,0) in (select rowid, mod(rownum,50) from t_ssv_soh_packets0)

Your only condition is mod(rownum, 50)

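select * from (select t.*, rownum as rn from t) where mod(rn, 50)=0

The last line should be a plain condition on mod(rownum, 50), not a self join with an in. In Oracle the row numbers have to be computed in an inline view first, as above, because ROWNUM is assigned only as rows pass the WHERE clause; a direct where mod(rownum, 50)=0 never matches any row.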
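One Oracle-specific caveat is worth checking before relying on a direct ROWNUM filter: ROWNUM is assigned as each row passes the WHERE clause, so a predicate that rejects row number 1 rejects every row. The sketch below shows this on 100 generated rows; the names n and rn are illustrative only.

```sql
-- Filter applied directly to ROWNUM: returns no rows, because the
-- first candidate is numbered 1, fails the test, and the next
-- candidate is numbered 1 again.
select n
from (select level as n from dual connect by level <= 100)
where mod(rownum, 50) = 0;

-- ROWNUM captured in an inline view first, then filtered outside:
-- returns two rows (those numbered 50 and 100).
select n
from (
  select n, rownum as rn
  from (select level as n from dual connect by level <= 100)
)
where mod(rn, 50) = 0;
```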

You can also let Oracle choose a random sample of rows by applying the SAMPLE clause, which goes directly after the table name:

SELECT oracle_time 
FROM t_ssv_soh_packets0 SAMPLE(50)
WHERE ...

This returns a random sample of roughly 50 percent of the rows.
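For a reduction closer to the one-in-50 in the question, the percentage can be lowered to 2. The following is a sketch under a few assumptions: the sample is taken from the table before the WHERE clause is applied, the number of rows returned is only approximate, and the seed value 1 is an arbitrary choice made so that repeated runs return the same sample.

```sql
-- SAMPLE follows the table name and precedes WHERE.
-- SAMPLE (2) asks for about 2 percent of the rows, roughly 1 in 50.
-- SEED (1) is an arbitrary value that makes the sample repeatable.
select oracle_time
from t_ssv_soh_packets0 sample (2) seed (1)
where oracle_time >= TIMESTAMP '2009-01-01 00:00:00'
  and oracle_time <= TIMESTAMP '2009-01-31 00:00:00'
```

Unlike the mod(rownum, 50) approaches, the rows kept here are chosen at random and are not evenly spaced.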

Licensed under: CC-BY-SA with attribution