Question

How can I make Solr give more relevance to a word based on its position in the string?

For example, if I search for "Macbook", the first results are items like "Case Logic LAPS-113 13.3-Inch Laptop / MacBook Air", and only after them comes "Apple MacBook Pro MD101LL/A 13.3-Inch".

This is my field declaration:

<fieldType name="text_pt" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>

        <filter class="solr.SynonymFilterFactory" synonyms="lang/index_synonyms.txt" ignoreCase="true" expand="false"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_pt.txt" />
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.KeywordMarkerFilterFactory" protected="lang/protwords_pt.txt"/>
        <filter class="solr.SynonymFilterFactory" synonyms="lang/synonyms.txt" ignoreCase="true" expand="false"/>
    </analyzer>

    <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>

        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_pt.txt" />
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.KeywordMarkerFilterFactory" protected="lang/protwords_pt.txt"/>
        <filter class="solr.SynonymFilterFactory" synonyms="lang/synonyms.txt" ignoreCase="true" expand="false"/>
    </analyzer>

</fieldType>

Solution

What if the product name were "MacBook/Dell/Lenovo Laptop cheap case"? It contains "MacBook" in the first position; would you still want to boost that document?

I think you should fix the root cause instead: the common problem of accessories (such as 'case', 'battery', 'lock', etc.) scoring higher than the products themselves.

The obvious best choice: index a field that says whether the doc is an accessory, and boost the docs that are not accessories. (I gather you don't have that info; if you did, this would be the best way.)
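
As a rough sketch of that first option, assuming you can add a flag at index time: the field name is_accessory, the product field name, the handler name /products and the boost value 5 are all made up for illustration, and the boolean field type is assumed to be defined in your schema (as it is in the stock example schemas). The bq parameter of the edismax parser adds an extra score to docs matching the boost query.

```xml
<!-- schema.xml: hypothetical flag, set to true for cases, batteries, locks, etc. -->
<field name="is_accessory" type="boolean" indexed="true" stored="true"/>

<!-- solrconfig.xml: hypothetical handler that favors non-accessories -->
<requestHandler name="/products" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <!-- "name" is an assumed product-name field of type text_pt -->
    <str name="qf">name</str>
    <!-- additive boost for docs that are not accessories; tune the weight -->
    <str name="bq">is_accessory:false^5</str>
  </lst>
</requestHandler>
```

The boost is additive, so the weight will likely need tuning against your real scores.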

If you don't have that info, you can try penalizing the docs that contain 'typical' accessory words. You need to build such a list, but that is not hard. I have used this approach with good results.
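
One way this could look, as a sketch only: edismax boost queries do not take negative weights, so a common workaround is to boost every doc that does not match the accessory words, which penalizes the matching ones in relative terms. The field name, the handler name /products_penalized, the three words and the weight 10 are placeholders, not values from the question.

```xml
<!-- solrconfig.xml: hypothetical handler that demotes docs with accessory words -->
<requestHandler name="/products_penalized" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <!-- "name" is an assumed product-name field of type text_pt -->
    <str name="qf">name</str>
    <!-- boost all docs except those whose name contains an accessory word -->
    <str name="bq">(*:* -name:(case OR battery OR lock))^10</str>
  </lst>
</requestHandler>
```

With this setup, a search for "Macbook" should tend to rank "Apple MacBook Pro MD101LL/A 13.3-Inch" above "Case Logic LAPS-113 13.3-Inch Laptop / MacBook Air", provided the word list covers the accessories in your catalog.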

Licensed under: CC-BY-SA with attribution