Solr - searches using two or more words only use first word for scoring

https://stackoverflow.com/questions/23594524

search
solr

20-07-2023

Question

When I issue a query

q=fulltext:marina zadar

Solr appears to use only the word "marina" to compute document scores. With term frequencies and positions omitted from the index (omitTermFreqAndPositions="true" on the field type), documents that contain both words get the same score as documents that contain only "marina".

This happens when I use the request handler below:

<requestHandler name="/select" class="solr.SearchHandler">
    <lst name="defaults">
        <str name="echoParams">explicit</str>
        <int name="rows">100</int>
        <str name="df">title</str>
    </lst>
</requestHandler>

When I define another request handler as

<requestHandler name="/full" class="solr.SearchHandler">
    <lst name="defaults">
        <str name="echoParams">explicit</str>
        <int name="rows">100</int>
        <str name="df">fulltext</str>
    </lst>
</requestHandler>

and issue the query to that handler as

q=marina zadar

everything works as expected: documents that contain both search words are scored higher.

Why does the query q=fulltext:marina zadar, sent to the /select handler, score documents differently from q=marina zadar sent to the /full handler?

Here's my schema.xml

<schema name="example" version="1.5">

<fields>

    <field name="_version_" type="long" indexed="true" stored="true"/>
    <field name="id" type="long" indexed="true" stored="true" required="true" />
    <field name="name" type="string" indexed="true" stored="true" />
    <field name="subName" type="string" indexed="false" stored="true" />
    <field name="nearName" type="string" indexed="false" stored="true" />
    <field name="countryName" type="string" indexed="false" stored="true" />
    <field name="title" type="text_general_edge_ngram" indexed="true" stored="false" multiValued="true" />
    <field name="fulltext" type="text_general" indexed="true" stored="true" />

</fields>

<uniqueKey>id</uniqueKey>

<copyField source="name" dest="title" />
<copyField source="subName" dest="title" />

<!--<similarity class="com.pocketsail.solr.DescriptionSimilarity" />-->

<types>

    <fieldType name="string" class="solr.StrField" sortMissingLast="true" />
    <fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/>

    <fieldType name="text_general_edge_ngram" class="solr.TextField" positionIncrementGap="100">
        <analyzer type="index">
            <tokenizer class="solr.LowerCaseTokenizerFactory"/>
            <filter class="solr.ASCIIFoldingFilterFactory"/>
            <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front"/>
        </analyzer>
        <analyzer type="query">
            <tokenizer class="solr.LowerCaseTokenizerFactory"/>
            <filter class="solr.ASCIIFoldingFilterFactory"/>
        </analyzer>
    </fieldType>

    <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100" omitNorms="true" omitTermFreqAndPositions="true">
        <analyzer type="index">
            <tokenizer class="solr.StandardTokenizerFactory"/>
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.ASCIIFoldingFilterFactory"/>
        </analyzer>
        <analyzer type="query">
            <tokenizer class="solr.StandardTokenizerFactory"/>
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.ASCIIFoldingFilterFactory"/>
        </analyzer>
    </fieldType>

</types>

</schema>

Solution

It turns out the words need to be enclosed in parentheses so that the field prefix applies to both of them. If I issue

q=fulltext:(marina zadar)

both words are used in scoring and the results are ordered as expected. In q=fulltext:marina zadar the fulltext: prefix applies only to the term immediately after it, so Solr looks for the word "marina" in the 'fulltext' field and for the word "zadar" in the default field of the request handler used (the df parameter in solrconfig.xml, which is 'title' for /select).
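
The scoping rule can be illustrated with a small toy model. This is only a sketch, not Solr's actual parser: the function name assign_fields is hypothetical, and it handles just bare terms, field:term and field:(term term). It shows which field each word ends up being searched in.

```python
import re

def assign_fields(q, default_field):
    """Toy model of field scoping in the standard query syntax.

    Not Solr's real parser (assumption: only bare terms, field:term
    and field:(term term ...) are supported). Returns (field, term) pairs.
    """
    pairs = []
    # Try field:(group) first, then field:term, then a bare term.
    pattern = re.compile(r'(\w+):\(([^)]*)\)|(\w+):(\S+)|(\S+)')
    for m in pattern.finditer(q):
        group_field, group_body, field, term, bare = m.groups()
        if group_field:
            # The prefix applies to every term inside the parentheses.
            pairs.extend((group_field, t) for t in group_body.split())
        elif field:
            # The prefix applies only to the single term that follows it.
            pairs.append((field, term))
        else:
            # A bare term falls back to the handler's default field (df).
            pairs.append((default_field, bare))
    return pairs

# /select has df=title, so "zadar" is looked up in title, not fulltext.
print(assign_fields("fulltext:marina zadar", "title"))
# With parentheses both words are searched in fulltext.
print(assign_fields("fulltext:(marina zadar)", "title"))
# /full has df=fulltext, so bare terms already go to fulltext.
print(assign_fields("marina zadar", "fulltext"))
```

In this model the first query pairs "zadar" with 'title', which matches the behaviour described above: only "marina" contributes to the match in 'fulltext'.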

This might have been a rookie error but perhaps it will help someone in the future.
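
One way to check how Solr actually parsed a query is to add debugQuery=true to the request and compare the parsed query in the debug section of the response for both forms. The sketch below only builds the request URLs; the host, port and core name (localhost:8983, collection1) are assumptions and need to be adapted to the actual installation.

```python
from urllib.parse import urlencode

# Hypothetical base URL: host, port and core name are assumptions.
BASE = "http://localhost:8983/solr/collection1/select"

def debug_url(q):
    # debugQuery=true asks Solr to return the parsed query and score explanations.
    return BASE + "?" + urlencode({"q": q, "debugQuery": "true", "wt": "json"})

# Build one URL per query form so the debug output can be compared.
for q in ("fulltext:marina zadar", "fulltext:(marina zadar)"):
    print(debug_url(q))
```

Opening both URLs and comparing the debug output should show whether "zadar" is being searched in 'title' or in 'fulltext'.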

Licensed under: CC-BY-SA with attribution