Full stops in a title are not searchable in SOLR

https://stackoverflow.com/questions/16804768

30-05-2022

Question

I have a fairly straightforward SOLR search implementation using DataImportHandler. One of the fields is the name of a business. The import builds a searchable field that combines the business name with the description.

The issue is that a search for a company called C.E.D. will not find it. I know it is there. A more general search does return a result.

Funnily enough, there is also a company called CED in the index. Searching for C.E.D. does not return that company, but searching for CED does. However, searching for CED does not return the company C.E.D.

As I write this, I realise that what I probably need to do is change the business name field so that it is consumed as is, with no filters altering the actual combination of words or punctuation?

Solution

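A standard configuration using WhitespaceTokenizerFactory should work for you.
The tokenizer splits the text on whitespace only, so the full stops stay inside the token, and LowerCaseFilterFactory then lowercases each token. With the same analysis applied at index and query time, the query terms match the indexed terms.
A search for C.E.D would then match both C.E.D and c.e.d.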

<fieldType name="text" class="solr.TextField">
    <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldType>

If you want C.E.D, c.e.d and ced to all match one another, look at the WordDelimiterFilterFactory filter, which can split a token on punctuation and join the resulting parts back into a single term.
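
As a rough sketch of that second option, the field type below adds WordDelimiterFilterFactory between the tokenizer and the lowercase filter. The field type name text_business_name is made up for illustration, and the attribute values are one plausible combination rather than a tested recommendation: catenateAll="1" is meant to join C, E and D into CED, preserveOriginal="1" keeps the token as typed, and generateWordParts="0" avoids emitting the single letters as separate terms. Newer Solr releases also provide WordDelimiterGraphFilterFactory, so check which one your version recommends.

```xml
<!-- Hypothetical field type name; attribute values are assumptions to verify on your Solr version -->
<fieldType name="text_business_name" class="solr.TextField">
    <analyzer>
        <!-- Split on whitespace only, so "C.E.D." reaches the filter as one token -->
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <!-- Join the parts around the full stops (intended: "CED") and keep the original token too -->
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="0"
                generateNumberParts="0"
                catenateAll="1"
                preserveOriginal="1"/>
        <!-- Lowercase last, so C.E.D, c.e.d and CED end up as comparable terms -->
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldType>
```

After changing the field type you would need to reindex. The Analysis screen in the Solr admin UI shows the tokens produced for C.E.D. and CED at both index and query time, which is the quickest way to confirm that they meet on a common term.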

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow