Question

I have a character varying field in postgres containing a 1-white-space-separated set of strings. E.g.:

  • --> one two three <--
  • --> apples bananas pears <--

I put --> and <-- to show where the strings start and end (they are not part of the stored string itself)

I need to query this field to find out if the whole string contains a certain word (apple for instance). A possible query would be

SELECT * FROM table WHERE thefield LIKE '%apple%'

But it sucks and won't scale as b-tree indexes only scale if the pattern is attached to the beginning of the string while in my case the searched string could be positioned anywhere in the field.

How would you recommend approaching the problem?

Was it helpful?

Solution

Consider database-normalization first.

While working with your current design, support the query with a trigram index, that will be pretty fast.

More details and links in this closely related answer:
PostgreSQL LIKE query performance variations

Even more about pattern matching and indexes in this related answer on dba.SE:
Pattern matching with LIKE, SIMILAR TO or regular expressions in PostgreSQL

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top