Solr - identical search result scores for multiple search terms?

https://stackoverflow.com/questions/9008629

19-04-2021

Question

How can I get different scores for results when the query contains multiple terms?

Certain results in Solr have the same score even when the query contains multiple terms, as the example below shows.

I have two documents indexed in Solr, each containing an id, a first name (stored in the name field), and a last_name. They look like the following:

<doc>
    <str name="id">1</str>
    <str name="last_name">fisher</str>
    <str name="name">john</str>
</doc>

<doc>
    <str name="id">2</str>
    <str name="last_name">darby</str>
    <str name="name">john</str>
</doc>

When I query just "john" both results come up. That is perfect. However, when I query "john fisher" both results come up but with the same score. What I want is different scores based on the relevancy of the search terms.

Here is the result for the following query: http://localhost:8983/solr/select?q=john+fisher%0D%0A&rows=10&fl=*%2Cscore

<response>
    ...
    <result name="response" numFound="2" start="0" maxScore="0.85029894">
        <doc>
            <float name="score">0.85029894</float>
            <str name="id">1</str>
            <str name="last_name">fisher</str>
            <str name="name">john</str>
        </doc>

        <doc>
            <float name="score">0.85029894</float>
            <str name="id">2</str>
            <str name="last_name">darby</str>
            <str name="name">john</str>
        </doc>
    </result>
</response>

Any help would be greatly appreciated.

Answer

Your best bet is to understand and analyse how the different factors affect your document's score. Lucene has a helpful feature called Explanation, and Solr builds on it to show how each score is calculated. Add the debugQuery parameter to the request to see how the score is derived:

?q=john&fl=score,*&rows=2&debugQuery=on

Example response:

<lst name="debug">
    <str name="rawquerystring">john</str>
    <str name="querystring">john</str>
    <str name="parsedquery">+DisjunctionMaxQuery((text:john))</str>
    <str name="parsedquery_toString">+(text:john)</str>
    <lst name="explain">
        <!-- Score calculation for Result#1 -->
        <str>
            2.1536596 = (MATCH) fieldWeight(text:john in 36722), product of:
            1.0 = tf(termFreq(text:john)=1)
            8.614638 = idf(docFreq=7591, maxDocs=15393998)
            0.25 = fieldNorm(field=text, doc=36722)
        </str>
        <!-- Score calculation for Result#2 -->
        <str>
            2.1536596 = (MATCH) fieldWeight(text:john in 36724), product of:
            1.0 = tf(termFreq(text:john)=1)
            8.614638 = idf(docFreq=7591, maxDocs=15393998)
            0.25 = fieldNorm(field=text, doc=36724)
        </str>
    </lst>
    ...
</lst>
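
As a worked check of the explain output above, the fieldWeight is simply the product of the three listed factors. The tf and idf formulas below assume Lucene's classic TF-IDF similarity (DefaultSimilarity), which the tf/idf/fieldNorm breakdown suggests but the answer does not state:

fieldWeight = tf * idf * fieldNorm
            = 1.0 * 8.614638 * 0.25
            ≈ 2.15366

tf  = sqrt(termFreq)                  = sqrt(1)                   = 1.0
idf = 1 + ln(maxDocs / (docFreq + 1)) = 1 + ln(15393998 / 7592)   ≈ 8.6146

Both documents have the same tf, idf, and fieldNorm for text:john, so their scores are identical. A second term such as "fisher" can only separate them if it is searched in a field where one of the documents actually contains it.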

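Besides this, you can use explainOther to find out why a certain document did not match the query, or why it scored the way it did. The parameter takes a query that identifies the documents to explain, for example by id:

?q=john&fl=score,*&rows=2&debugQuery=on&explainOther=id:2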

Other suggestions

It looks to me like you are only searching on the "name" field. That's why the scores are the same. If you use DisMax (or the extended eDisMax parser) you can easily search on both fields, and the document that matches more of the query terms will get a higher score.

For example:

<str name="defType">edismax</str>
<str name="qf">name last_name</str>
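
The two parameters above are usually set as request handler defaults in solrconfig.xml. A minimal sketch follows; the handler name "/select" is an assumption here (older Solr versions often name the default handler "search" or "standard"), so adapt it to the handler your configuration already defines:

<!-- Assumed handler name; use the one already present in your solrconfig.xml -->
<requestHandler name="/select" class="solr.SearchHandler">
    <lst name="defaults">
        <!-- Parse q with the extended DisMax parser -->
        <str name="defType">edismax</str>
        <!-- Search each query term in both fields -->
        <str name="qf">name last_name</str>
    </lst>
</requestHandler>

The same parameters can also be passed on a single request to try them out without changing the configuration:

?q=john+fisher&defType=edismax&qf=name+last_name&fl=*,score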

Another way is to combine the 2 fields into 1 field with copyField and only search in the newly created field.
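
A minimal sketch of that copyField approach in schema.xml is shown below. The destination field name "full_name" is hypothetical, and the field type "text_general" is assumed to exist in your schema (older example schemas call it "text"); substitute whatever analysed text type you already have:

<!-- Hypothetical catch-all field; multiValued because two sources are copied into it -->
<field name="full_name" type="text_general" indexed="true" stored="false" multiValued="true"/>

<!-- Copy both name parts into the combined field at index time -->
<copyField source="name" dest="full_name"/>
<copyField source="last_name" dest="full_name"/>

After changing the schema, the documents have to be reindexed before the combined field is populated; queries would then target it, for example with q=full_name:(john fisher).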

Thanks for the quick reply guys, I appreciate that.

From the explain output I was able to confirm that the search was indeed only being performed on one field.

I saw that it is possible to copy multiple fields into a single field for searching. In schema.xml I added the following:

<copyField source="last_name" dest="text"/>

The results now come up as expected when using more than one search term.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow