Sequence of columns
Create a partial multicolumn index with the right sequence of columns. You have one:
"index_pings_on_created_at_and_monitor_id" btree (created_at DESC, monitor_id)
But the sequence of columns is not serving you well. Reverse it:
CREATE INDEX idx_pings_monitor_created ON pings (monitor_id, created_at DESC)
WHERE response_time IS NOT NULL;
The rule of thumb here is: equality first, ranges later. More about that:
Multicolumn index and performance
As discussed, the condition WHERE response_time IS NOT NULL does not buy you much. If you have other queries that could use this index and need rows with NULL values in response_time, drop the condition. Otherwise, keep it.
You can probably also drop both other existing indexes. More about the sequence of columns in btree indexes:
Working of indexes in PostgreSQL
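Before dropping the existing indexes, it can help to look at how often each one is actually scanned. A minimal sketch, assuming the statistics have been collected over a representative workload; only the table name pings is taken from the question:

```sql
-- Scan counts and on-disk size for every index on pings.
-- idx_scan = 0 over a representative period suggests the index is unused.
SELECT indexrelname AS index_name
     , idx_scan     AS scans
     , pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM   pg_stat_user_indexes
WHERE  relname = 'pings'
ORDER  BY idx_scan;
```

Indexes that back a PRIMARY KEY or UNIQUE constraint have to stay regardless of their scan count.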
Covering index
If all you need from the table is response_time, this can be much faster yet - if you don't have lots of write operations on the rows of your table. Include the column in the index at the last position to allow index-only scans (making it a "covering index"):
CREATE INDEX idx_pings_monitor_created
ON pings (monitor_id, created_at DESC, response_time)
WHERE response_time IS NOT NULL; -- maybe
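Whether you actually get an index-only scan can be checked with EXPLAIN. A sketch using the query values from this answer; the plan you see depends on your data and Postgres version:

```sql
-- Index-only scans rely on the visibility map, which VACUUM keeps current.
VACUUM ANALYZE pings;

-- Run the query and show the chosen plan with timing and buffer usage.
EXPLAIN (ANALYZE, BUFFERS)
SELECT response_time
FROM   pings
WHERE  monitor_id = 3
AND    response_time IS NOT NULL
AND    created_at > '2014-03-03 20:23:07.254281';
```

Look for an "Index Only Scan" node on the new index and a low "Heap Fetches" count. Many heap fetches usually mean the table has seen a lot of writes since the last VACUUM.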
Or, you could even try this:
More radical partial index
Create a tiny helper function. Effectively a "global constant" in your db:
CREATE OR REPLACE FUNCTION f_ping_event_horizon()
RETURNS timestamp LANGUAGE sql IMMUTABLE COST 1 AS
$$SELECT '2014-03-03 0:0'::timestamp$$; -- One month in the past
Use it as condition in your index:
CREATE INDEX idx_pings_monitor_created_response_time
ON pings (monitor_id, created_at DESC, response_time)
WHERE response_time IS NOT NULL -- maybe
AND created_at > f_ping_event_horizon();
And your query looks like this now:
SELECT response_time
FROM pings
WHERE monitor_id = 3
AND response_time IS NOT NULL
AND created_at > '2014-03-03 20:23:07.254281'
AND created_at > f_ping_event_horizon();
Aside: I trimmed some noise.
The last condition seems logically redundant. Only include it if Postgres does not understand that it can use the index without it. It might be necessary. The actual timestamp in the query must be later than the one in the function, but that's obviously the case according to your comments.
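One way to find out whether the extra condition is needed is to run the query without it and inspect the plan. A sketch only; whether the planner can prove that the partial index applies is something to verify on your own installation:

```sql
-- Same query, but without the explicit f_ping_event_horizon() condition.
EXPLAIN
SELECT response_time
FROM   pings
WHERE  monitor_id = 3
AND    response_time IS NOT NULL
AND    created_at > '2014-03-03 20:23:07.254281';
```

If the plan names idx_pings_monitor_created_response_time, the condition can be left out. If it falls back to another index or a sequential scan, keep it.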
This way we cut all the irrelevant rows and make the index much smaller. The effect degrades slowly over time as new rows accumulate past the fixed horizon. Refit the event horizon and recreate indexes from time to time to get rid of added weight. You could do that with a weekly cron job, for example.
When updating (recreating) the function, you need to recreate all indexes that use the function in any way, best in the same transaction, because the IMMUTABLE declaration for the helper function is a bit of a false promise. But Postgres only accepts immutable functions in index definitions, so we have to lie about it. More about that:
Does PostgreSQL support "accent insensitive" collations?
Why the function at all? This way, all the queries using the index can remain unchanged.
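The periodic refit could look roughly like this. A sketch, assuming the index from above is the only one that uses the function; the new cut-off timestamp is a made-up example value:

```sql
BEGIN;

-- Move the event horizon forward (hypothetical new cut-off).
CREATE OR REPLACE FUNCTION f_ping_event_horizon()
RETURNS timestamp LANGUAGE sql IMMUTABLE COST 1 AS
$$SELECT '2014-04-03 0:0'::timestamp$$;

-- Rebuild the partial index so its predicate is evaluated with the new value.
DROP INDEX IF EXISTS idx_pings_monitor_created_response_time;
CREATE INDEX idx_pings_monitor_created_response_time
ON pings (monitor_id, created_at DESC, response_time)
WHERE response_time IS NOT NULL -- maybe
AND created_at > f_ping_event_horizon();

COMMIT;
```

Keep in mind that a plain CREATE INDEX blocks writes to the table while it runs, and that CREATE INDEX CONCURRENTLY cannot be used inside a transaction block. On a busy table you may have to pick a quiet time slot for the job.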
With all of these changes the query should be faster by orders of magnitude now. A single, continuous index-only scan is all that's needed. Can you confirm that?