Filtering SQL query by row and by date range
Problem
I have a time-indexed Oracle DB that I'm trying to query by date range. I also want to reduce the data in the query so I'm not overwhelmed by the number of rows.
The stand-alone date query (2352 rows in 0.203s):
select oracle_time from t_ssv_soh_packets0
where oracle_time >= TIMESTAMP '2009-01-01 00:00:00'
AND oracle_time <= TIMESTAMP '2009-01-31 00:00:00'
The stand-alone reduction query (1017 rows in 0.89s):
select oracle_time from t_ssv_soh_packets0
where (rowid,0) in (select rowid, mod(rownum,50) from t_ssv_soh_packets0)
When I try to combine them it takes forever (48 rows in 32.547s):
select oracle_time from t_ssv_soh_packets0
where oracle_time >= TIMESTAMP '2009-01-01 00:00:00'
AND oracle_time <= TIMESTAMP '2009-01-31 00:00:00'
AND (rowid,0) in (select rowid, mod(rownum,50) from t_ssv_soh_packets0)
Obviously I'm doing something fundamentally wrong here, but I don't know how else to both query by date and reduce the data.
Solution
Thanks to both 'Narveson' and 'nate c' for the pointers; I finally figured it out. Here is the (probably Oracle-specific) query that I came up with:
select oracle_time from t_ssv_soh_packets0
where oracle_time >= TIMESTAMP '2009-01-01 00:00:00'
AND oracle_time <= TIMESTAMP '2009-01-31 00:00:00'
group by oracle_time, rownum
having mod(rownum, 50)=0
This query returns 47 rows in 0.031s. The date-only query returned 2352 rows, and 2352 / 50 rounds down to 47, so the count makes sense.
The ORAFAQ helped me get to the final solution.
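One thing to keep in mind with the query above is that rownum follows whatever order Oracle happens to fetch the rows in, so which rows land on a multiple of 50 is not guaranteed to be stable. If every 50th row in time order is wanted, a possible alternative is the analytic ROW_NUMBER() function. This is only a sketch under that assumption, not something tested against the original table; the alias rn is made up for illustration.

```sql
-- Sketch: keep every 50th row of the date range, counted in time order.
-- The inner WHERE runs before ROW_NUMBER() is evaluated, so only rows
-- inside the date range are numbered.
SELECT oracle_time
FROM (
    SELECT oracle_time,
           ROW_NUMBER() OVER (ORDER BY oracle_time) AS rn  -- rn is a hypothetical alias
    FROM t_ssv_soh_packets0
    WHERE oracle_time >= TIMESTAMP '2009-01-01 00:00:00'
      AND oracle_time <= TIMESTAMP '2009-01-31 00:00:00'
)
WHERE MOD(rn, 50) = 0
ORDER BY oracle_time;
```

With 2352 rows in the range this should also keep 47 rows (rn = 50, 100, ..., 2350), but the same 47 on every run.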
Other tips
You are evaluating your reduction logic against rows that are not in your chosen date range.
Apply the reduction logic to a subquery containing your date range.
LATER: Here's what I meant.
select oracle_time from (
select oracle_time, rownum as limited_row_num
from t_ssv_soh_packets0
where oracle_time >= TIMESTAMP '2009-01-01 00:00:00'
AND oracle_time <= TIMESTAMP '2009-01-31 00:00:00'
) time_range
where mod(limited_row_num,50) = 0
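Get rid of the IN.
Why use this?
select oracle_time from t_ssv_soh_packets0
where (rowid,0) in (select rowid, mod(rownum,50) from t_ssv_soh_packets0)
Your only condition is mod(rownum, 50), so there is no need for a self join with an IN. The idea is simply:
select * from t where mod(rownum, 50)=0
so the last line of the combined query would become AND mod(rownum,50)=0.
Note that Oracle assigns ROWNUM only to rows that pass the WHERE clause: the first candidate row is numbered 1, fails the test, and the next candidate is numbered 1 again, so testing mod(rownum, 50)=0 directly in WHERE never matches any row. The row number has to be computed first, either in a subquery or with the GROUP BY ... HAVING form shown in the solution above.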
You can also let Oracle choose a random sample of the rows with the SAMPLE clause, which goes directly after the table name:
SELECT oracle_time FROM t_ssv_soh_packets0 SAMPLE(50) WHERE ...
This returns approximately 50 percent of the rows, chosen at random.
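Since the question reduces the data by a factor of 50, a sample of about 2 percent is probably the closer match. The following is a sketch under that assumption, reusing the table and date range from the question; because the sample is random, the number of rows returned will vary from run to run.

```sql
-- Sketch: roughly 1 row in 50 (2 percent), restricted to the date range.
-- In Oracle the SAMPLE clause follows the table name and precedes WHERE.
SELECT oracle_time
FROM t_ssv_soh_packets0 SAMPLE (2)
WHERE oracle_time >= TIMESTAMP '2009-01-01 00:00:00'
  AND oracle_time <= TIMESTAMP '2009-01-31 00:00:00';
```

Unlike the mod(rownum, 50) approach, this does not give evenly spaced rows, so it fits best when only a representative subset is needed.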