Question

I have added LanguageAnalysis to my schema file. After adding it, the stemming filter factory started working, but some words have become unsearchable.

I have added in query time after the .

My schema file looks like this:

    <schema name="test" version="1.50">
      <types>
        <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
        <fieldType name="boolean" class="solr.BoolField" sortMissingLast="true" />
        <fieldType name="int" class="solr.TrieIntField" precisionStep="0" positionIncrementGap="0" omitNorms="true"/>
        <fieldType name="float" class="solr.TrieFloatField" precisionStep="0" positionIncrementGap="0"/>
        <fieldType name="date" class="solr.TrieDateField" precisionStep="6" positionIncrementGap="0"/>
        <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
          <analyzer type="index">
            <tokenizer class="solr.StandardTokenizerFactory"/>
            <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="100" side="front"/>
            <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="100" side="back"/>
          </analyzer>
          <analyzer type="query">
            <tokenizer class="solr.StandardTokenizerFactory"/>
            <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
            <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.PorterStemFilterFactory"/>
          </analyzer>
        </fieldType>
        <fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/>
      </types>
      <fields>
        <field name="Id" type="int" indexed="true" stored="true" required="true" />
        <field name="Name" type="text" indexed="true" stored="true" required="false" />
        <field name="ShortDescription" type="text" indexed="true" stored="true" required="false" />
        <field name="FullDescription" type="text" indexed="true" stored="false" required="false" />

        <field name="_version_" type="long" indexed="true" stored="true"/>
      </fields>

      <uniqueKey>Id</uniqueKey>
      <copyField source="Name" dest="NameCopy"/>
      <defaultSearchField>Name</defaultSearchField>
      <solrQueryParser defaultOperator="OR"/>
    </schema>

Some words, such as battery, innovative, more, etc., became unsearchable after I added solr.PorterStemFilterFactory.

Why is this happening? I have also tried other filter factories for stemming and LanguageAnalysis, but they behave the same way.

Please help me. I am confused about what is going wrong here!

Solution

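That is because the Porter stemmer changes your words when the query is analyzed. For example, it turns battery into batteri, and that stemmed term is then looked up in an index that was built without stemming, so it does not match.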

You have to apply the same stemming algorithm to the field at index time as well. In your case, add this line to the <analyzer type="index"> section, then reindex your documents so that the indexed terms are stemmed too:

    <filter class="solr.PorterStemFilterFactory"/>

Have a look at this page to see how the Porter stemmer changes your words: http://9ol.es/porter_js_demo.html
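
For context, here is a rough sketch of how the index analyzer from the question might look with the stemmer added. The filters are copied from the question's schema. Putting the stemmer after the lowercase filter and before the two EdgeNGram filters is my assumption, not something the fix strictly requires; the idea is that whole words get stemmed before they are cut into n-grams.

```xml
<!-- Sketch only: index-time analyzer from the question with the stemmer added.
     The position of the stemmer (before the EdgeNGram filters) is an assumption. -->
<analyzer type="index">
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <!-- Same stemmer as in the query analyzer, so both sides produce the same stems -->
  <filter class="solr.PorterStemFilterFactory"/>
  <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="100" side="front"/>
  <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="100" side="back"/>
</analyzer>
```

The Analysis screen in the Solr admin UI can show the tokens that the text field type produces at index time and at query time for a word such as battery, which should make any remaining mismatch easy to spot.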

Licensed under: CC-BY-SA with attribution