PostgreSQL text range scan

https://stackoverflow.com/questions/22898438

28-06-2023

Question

I have written a query that fetches 10 results including the current one, padded with up to 9 entries on either side, for an alphabetical list that can be sorted by the receiver. This is the query I am using. My issue is not with the result, but that neither of the subqueries uses an index.

(
  SELECT
    uid,
    title
  FROM
    books
  WHERE
  lower(title) < lower('Frankenstein')
  ORDER BY title desc
  LIMIT 9
)
UNION
(
  SELECT
    uid,
    title
  FROM
    books
  WHERE
  lower(title) >= lower('Frankenstein')
  ORDER BY title
  LIMIT 10
)
ORDER BY title;

The index I am trying to utilize is a simple btree (no text_pattern_ops etc.), as below:

CREATE INDEX books_title_idx ON books USING btree (lower(title));

If I run explain on the first part of the union, it performs a full table scan in spite of the limit and order:

explain analyze 
SELECT
  uid,
  title
FROM
  books
WHERE
lower(title) < lower('Frankenstein')
ORDER BY title desc
LIMIT 9

Limit  (cost=69.04..69.06 rows=9 width=152) (actual time=6.276..6.292 rows=9 loops=1)
  ->  Sort  (cost=69.04..69.67 rows=251 width=152) (actual time=6.273..6.277 rows=9 loops=1)
        Sort Key: ((title))
        Sort Method: top-N heapsort  Memory: 25kB
        ->  Seq Scan on books  (cost=0.00..63.80 rows=251 width=152) (actual time=0.056..5.227 rows=267 loops=1)
              Filter: (lower((title)) < 'frankenstein'::text)
              Rows Removed by Filter: 486
Total runtime: 6.359 ms

When I do an equality check in the same query, the index is used:

explain analyze
SELECT
  uid,
  title
FROM
  books
WHERE
lower(title) = lower('Frankenstein')
ORDER BY title desc

Sort  (cost=17.04..17.05 rows=4 width=152) (actual time=0.054..0.054 rows=0 loops=1)
  Sort Key: ((title))
  Sort Method: quicksort  Memory: 25kB
  ->  Bitmap Heap Scan on books  (cost=4.31..17.00 rows=4 width=152) (actual time=0.041..0.041 rows=0 loops=1)
        Recheck Cond: (lower((title)) = 'frankenstein'::text)
        ->  Bitmap Index Scan on books_title_idx  (cost=0.00..4.31 rows=4 width=0) (actual time=0.036..0.036 rows=0 loops=1)
              Index Cond: (lower((title)) = 'frankenstein'::text)
Total runtime: 0.129 ms

The same applies when I do a between query:

explain analyze
SELECT
  uid,
  title
FROM
  books
WHERE
lower(title) > lower('Frankenstein') AND lower(title) < lower('Gulliver''s Travels')
ORDER BY title

Sort  (cost=17.08..17.09 rows=4 width=152) (actual time=0.511..0.529 rows=25 loops=1)
  Sort Key: (title)
  Sort Method: quicksort  Memory: 27kB
  ->  Bitmap Heap Scan on books  (cost=4.33..17.04 rows=4 width=152) (actual time=0.118..0.213 rows=25 loops=1)
        Recheck Cond: ((lower(title) > 'frankenstein'::text) AND (lower(title) < 'gulliver''s travels'::text))
        ->  Bitmap Index Scan on books_title_idx  (cost=0.00..4.33 rows=4 width=0) (actual time=0.087..0.087 rows=25 loops=1)
              Index Cond: ((lower(title) > 'frankenstein'::text) AND (lower(title) < 'gulliver''s travels'::text))
Total runtime: 0.621 ms

What I am obviously looking for here is not a between search, because the beginning and end are unknown. So is this a PostgreSQL limitation, or is there something, other than setting the cost of a table scan to something silly, that I can use to convince the query planner to use the index?

I am using PostgreSQL 9.3

Solution

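Order by the indexed expression instead of the raw column. In the first subquery use:

ORDER BY lower(title) DESC

and in the second:

ORDER BY lower(title)

This matches your functional index, so it can be utilized: the planner can read rows in index order and stop as soon as the LIMIT is satisfied. With ORDER BY title, the index order does not match the requested sort, so every row that passes the filter has to be sorted before the LIMIT can be applied.
ORDER BY is irrelevant for the selection of rows in the other two queries. That's why the index can be used in those cases.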
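
For reference, here is a sketch of the full query from the question with only the two inner ORDER BY clauses changed. It assumes the books table and the books_title_idx index exactly as defined in the question. The outer ORDER BY is left on title, because PostgreSQL only accepts result column names, not expressions, in the ORDER BY of a UNION.

```sql
-- Sketch only: table, columns and index are the ones from the question.
(
  SELECT uid, title
  FROM books
  WHERE lower(title) < lower('Frankenstein')
  ORDER BY lower(title) DESC   -- same expression as books_title_idx
  LIMIT 9
)
UNION
(
  SELECT uid, title
  FROM books
  WHERE lower(title) >= lower('Frankenstein')
  ORDER BY lower(title)        -- same expression as books_title_idx
  LIMIT 10
)
ORDER BY title;                -- a UNION result can only be ordered by output columns
```

Running EXPLAIN ANALYZE on each branch separately is the simplest way to check whether the planner now picks an index scan. On a table this small it may still choose a sequential scan if it estimates that to be cheaper.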

Licensed under: CC-BY-SA with attribution
