Question

I am trying to run a fulltext query using Postgresql that can cater for partial matches using wildcards.

It seems easy enough to have a postfix wildcard after the search term, however I cannot figure out how to specify a prefix wildcard.

For example, I can perform a postfix search easily enough using something like..

SELECT "t1".* 
FROM "t1" 
WHERE (to_tsvector('simple', "t1"."city") @@ to_tsquery('simple', 'don:*') )

should return results matching "London"

However I cant seem to do a prefix search like...

SELECT "t1".* 
FROM "t1" 
WHERE (to_tsvector('simple', "t1"."city") @@ to_tsquery('simple', ':*don') )

Ideally I'd like to have a wildcard prefixed to the front and end of the search term, something like...

SELECT "t1".* 
FROM "t1" 
WHERE (to_tsvector('simple', "t1"."city") @@ to_tsquery('simple', ':*don:*') )

I can use a LIKE condition however I was hoping to benefit from the performance of the full text search features in Postgres.

Was it helpful?

Solution

Full text search is good for finding words, not substrings.

For substring searches you'd better use like '%don%' with pg_trgm extension available from PostgreSQL 9.1 and using gin (column_name gin_trgm_ops) or using gist (column_name gist_trgm_ops) indexes. But your index would be very big (even several times bigger than your table) and write performance not very good.

There's a very good example of using pg_trgm for substring search on select * from depesz blog.

OTHER TIPS

One wild and crazy way of doing it would be to create a tsvector index of all your documents, reversed. And reverse your queries for postfix search too.

This is essentially what Solr does with its ReversedWildcardFilterFactory

select
reverse('brown fox')::tsvector @@ (reverse('rown') || ':*')::tsquery --true
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top