Question

There must be a better way of writing this query.

I want to select all the data between a pair of dates. Ideally the first and last rows of the result set would be those specifed in the WHERE clause. If those rows don't exist, I want the rows preceeding and following the requested range.

An example:

If my data is:

...
135321, 20090311 10:15:00
135321, 20090311 10:45:00
135321, 20090311 11:00:00
135321, 20090311 11:15:00
135321, 20090311 11:30:00
135321, 20090311 12:30:00
...

And the query is:

    SELECT * 
    FROM data_bahf 
    WHERE param_id = 135321 
    AND datetime >= '20090311 10:30:00' 
    AND datetime <= '20090311 12:00:00'

I want the returned data to include the row at 10:15, and that of 12:30. Not just those that strictly meet the WHERE clause.

This is the best I've come up with.

SELECT * FROM (
    SELECT * 
    FROM data_bahf 
    WHERE param_id = 135321 
    AND datetime > '20090311 10:30:00' 
    AND datetime < '20090311 12:00:00'

    UNION

    (
        SELECT * FROM data_bahf 
        WHERE param_id = 135321 
        AND datetime <= '20090311 10:30:00' 
        ORDER BY datetime desc
        LIMIT 1
    )

    UNION

    (
        SELECT * FROM data_bahf 
        WHERE param_id = 135321 
        AND datetime >= '20090311 12:00:00'
        ORDER BY datetime asc
        LIMIT 1
    )
) 
AS A
ORDER BY datetime

(Ignore the use of SELECT * for now)

EDIT: I have indexes on param_id, datetime, and (param_id, datetime)

Was it helpful?

Solution

First, make sure that you have a composite index on (param_id, datetime)

Second, query like this:

SELECT  *
FROM    data_bahf
WHERE   param_id = 135321
        AND datetime BETWEEN
        COALESCE(
        (
        SELECT  MAX(datetime)
        FROM    data_bahf
        WHERE   param_id = 135321
              AND datetime <= '2009-01-01 00:00:00'
        ), '0001-01-01')
        AND 
        COALESCE(
        (
        SELECT  MIN(datetime)
        FROM    data_bahf
        WHERE   param_id = 135321
              AND datetime >= '2009-01-02 00:00:00'
        ), '9999-01-01')

Just checked, it runs in 1.215 ms for a sample table of 200,000 rows

OTHER TIPS

I'd say this:

SELECT 
  o.* 
FROM 
  data_bahf o
WHERE 
  o.param_id = 135321 
  AND o.datetime BETWEEN
  ISNULL(
    (
      SELECT   MAX(datetime) 
      FROM     data_bahf i
      WHERE    i.param_id = 135321 AND i.datetime <= '20090311 10:30:00'
    ),
    '0001-01-01 00:00:00'
  )
  AND
  ISNULL(
    (
      SELECT   MIN(datetime) 
      FROM     data_bahf i
      WHERE    i.param_id = 135321 AND i.datetime >= '20090311 12:00:00'
    ),
    '9999-12-31 23:59:59'
  )

EDIT: Fallback added.
When there is no row matching the sub-query, it will result in a NULL value, which must be caught by ISNULL() or the BETWEEN operator will fail and the main query will return no rows at all.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top