Domanda

I have searched on the web for many days by it seems the internet has never heard of my problem:

I have a postal address database table holding about 37M records for United Kingdom, which has a geospatial index and a derived full text index created like so:

create index on gb_locations using gin(to_tsvector('english', "Postcode" || ' ' || "Postcode_outcode" || ' ' || "Road" || ' ' || "Neighbourhood" || ' ' || "Admin2" || ' ' || "Admin3");)

My full text search is in the form:

SELECT * FROM gb_locations
WHERE
    to_tsvector('english', "Postcode" || ' ' || "Postcode_outcode" || ' ' || "Road" || ' ' || "Neighbourhood" || ' ' || "Admin2" || ' ' || "Admin3") @@ plainto_tsquery('english', 'greenham road rg14')

The query works fine for most uk addresses, especially in the London area, but for locations furtuher afield the query returns no results.

I have verified that the record exists in the table as I can find it using a geospatial search but for a full text searches, it seems like the the database is not aware of it.

This is the explaination:

Bitmap Heap Scan on gb_locations  (cost=52.04..56.10 rows=1 width=521)
  Recheck Cond: (to_tsvector('english'::regconfig, ((((((((((("Postcode")::text || ' '::text) || ("Postcode_outcode")::text) || ' '::text) || "Road") || ' '::text) || ("Neighbourhood")::text) || ' '::text) || ("Admin2")::text) || ' '::text) || ("Admin3")::text)) @@ '''greenham'' & ''road'' & ''rg14'''::tsquery)
  ->  Bitmap Index Scan on text_search_index  (cost=0.00..52.04 rows=1 width=0)
        Index Cond: (to_tsvector('english'::regconfig, ((((((((((("Postcode")::text || ' '::text) || ("Postcode_outcode")::text) || ' '::text) || "Road") || ' '::text) || ("Neighbourhood")::text) || ' '::text) || ("Admin2")::text) || ' '::text) || ("Admin3")::text)) @@ '''greenham'' & ''road'' & ''rg14'''::tsquery)

Any poiners would be much appreciated.

È stato utile?

Soluzione

If certain fields can be NULL, you need to apply coalesce(field, '') on them in the global concatenation that gets the string to search into.

Otherwise it seems to work with the example values given in the comments:

select to_tsvector('english','RG147SW RG14 Greenham Road Newbury  West Berkshire')
  @@ plainto_tsquery('english', 'greenham road rg14');

 ?column? 
----------
 t
(1 row)

But this one won't match (the result is NULL) and this would be the case when Admin2 is NULL, or more generally any other field passed as is to the || operator.

   select to_tsvector('english','RG147SW RG14 Greenham Road ' || NULL || ' Newbury  West Berkshire')
      @@ plainto_tsquery('english', 'greenham road rg14');

?column? 
----------

(1 row)

Altri suggerimenti

Just to add to what Daniel Vérité said,

a full text index must be created as follows if any of the fields are expected to be NULL:

create index [index name] on [table name] using gin(to_tsvector('english', coalesce("Field1",'') || ' ' || coalesce("Field2",'') || ' ' || coalesce("Field3",'') ....));

Furthermore the same template must be used in the query itself as follows:

SELECT * FROM [table name] WHERE to_tsvector('english', coalesce("Field1",'') || ' ' || coalesce("Field2",'') || ' ' || coalesce("Field3",'') ....) @@ plainto_tsquery('english', '[your search sentance/phrase]');
Autorizzato sotto: CC-BY-SA insieme a attribuzione
Non affiliato a StackOverflow
scroll top