I am using Sphinx to provide search to a website and I've run across a bit of a snag when returning relevant results.

To keep my question simple, let's assume that I have two fields, @title and @body, which are weighted 100 & 15 respectively. When I search for small words like the word 'in' I would like to have it rank exact matches for that search term higher and then check for matches to 'in*|*in|*in*' and rank them slightly lower. Is there any way to have this type of specificity for your searches?

Example results for 'in':

  1. Indian Food
  2. In The Middle
  3. Document about Latin

Some relevant settings are:

In sphinx.conf:

morphology              = stem_en
charset_type            = utf-8
min_word_len            = 2
min_prefix_len          = 0
min_infix_len           = 2
enable_star             = 1

In search.php:

$sp->SetMatchMode( SPH_MATCH_EXTENDED2 );
$sp->SetRankingMode( SPH_RANK_PROXIMITY_BM25 );
$sp->SetFieldWeights ( array('title' => 100, 'body' => 15) );

Also, as a side note: I've also had some instances where partial matches don't even show up in the search results. For example, I have searched for Cow but Cowboy does not show up as a result. I have also searched for Cowb and Cowbo and it wasn't until I typed Cowboy that I received the expected result. Any thoughts?


This question is along the same lines as this previous SO question, but I hope I've given a little more detail as to my problem and the things I've tried to warrant a solution.

Solution

It looks like "Cow" is not morphologically related to "Cowboy", so stemming alone will not make one match the other.

You could solve it in two ways:

  1. Use a wordforms file with an entry such as Cow > Cowboy.
  2. Since enable_star is on, you could change the query from "Cow" to "Cow*", which will find all words starting with "Cow".
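The second option can be sketched as a small helper that appends a star to every keyword before the query is sent to Sphinx. This is only an illustration: toPrefixQuery() is a hypothetical name, not part of the Sphinx API, and it assumes enable_star = 1 with prefix or infix indexing turned on, as in the question's config.

```php
<?php
// Hypothetical helper (not part of the Sphinx API): append "*" to every
// keyword so that Sphinx matches by prefix when enable_star = 1.
function toPrefixQuery(string $input): string
{
    // Keep only letters, digits and whitespace so user input cannot
    // inject extended-syntax operators such as @, | or quotes.
    $clean = preg_replace('/[^\p{L}\p{N}\s]+/u', ' ', $input) ?? '';
    $words = preg_split('/\s+/', trim($clean), -1, PREG_SPLIT_NO_EMPTY);

    $starred = array_map(function ($w) { return $w . '*'; }, $words);
    return implode(' ', $starred);
}

echo toPrefixQuery('Cow'), "\n";       // Cow*
echo toPrefixQuery('cow boy!'), "\n";  // cow* boy*
```

The result would then be passed to $sp->Query() in place of the raw search string.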

Regarding different ranking for "in" and "*in*", I would suggest having two body fields in the index, say body and body_star, where body_star holds the same content as body.

In search.php:

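// Extended query syntax is needed for the @field operators.
$sp->SetMatchMode( SPH_MATCH_EXTENDED2 );
$sp->SetRankingMode( SPH_RANK_PROXIMITY_BM25 );
$sp->SetFieldWeights ( array('title' => 20, 'body' => 15, 'body_star' => 5) );
// "|" is OR: a document matching any one of the three clauses is returned.
$sp->Query("(@title in) | (@body in) | (@body_star *in*)");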

This should do the trick.
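For arbitrary user input, the query string could be built along these lines. This is a minimal sketch under stated assumptions: buildRankedQuery() is a hypothetical helper, it handles a single keyword only, it joins the clauses with the OR operator so that a match in any one field is enough, and it assumes the index really has title, body and body_star fields.

```php
<?php
// Hypothetical helper: build an extended-syntax query that looks for the
// exact keyword in title and body, and for the *keyword* infix form in
// body_star. Field names are taken from the answer above.
function buildRankedQuery(string $term): string
{
    // Drop anything that is not a letter or digit so the term cannot
    // contain extended-syntax operators.
    $term = preg_replace('/[^\p{L}\p{N}]+/u', '', $term) ?? '';
    if ($term === '') {
        return '';
    }
    // %1$s reuses the same argument in all three clauses.
    return sprintf('(@title %1$s) | (@body %1$s) | (@body_star *%1$s*)', $term);
}

echo buildRankedQuery('in'), "\n";  // (@title in) | (@body in) | (@body_star *in*)
```

Because exact matches land in the more heavily weighted title and body fields while infix matches only score through body_star, exact hits should generally rank higher.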

Other tips

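You could also set the expand_keywords option in your config (http://sphinxsearch.com/docs/1.10/conf-expand-keywords.html) and set the ranking mode to SPH_RANK_SPH04 (http://sphinxsearch.com/blog/2010/08/17/how-sphinx-relevance-ranking-works/). With expand_keywords enabled, each keyword is internally expanded to include its starred form, so documents with exact matches should generally rank higher than documents with only infix matches.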

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow