Question

I am a complete noob when it comes to Solr. This is my first configuration, and I am having trouble getting Solr data to be filtered properly. We are using Solr 4.0, the 09-21-2011 snapshot. I want to capitalize the first letter of each word in various fields. The data we index looks like 'name' = 'STAR WARS'. When I query the data, I want the name to come back as 'Star Wars', but it comes back as 'STAR WARS'.

Here is my setup

<fieldType name="text_capital" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>                
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.CapitalizationFilterFactory" onlyFirstWord="false" okPrefix="CVS"/>         
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">                      
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.CapitalizationFilterFactory" onlyFirstWord="false" okPrefix="CVS"/>                
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

And here is the field mapping

<field name="name" type="text_capital" indexed="true" stored="true" />

Now, when I look at the analyzer, everything looks fine for both query and index: it hits the tokenizer and all the filters properly. But when I run a query, the results come back with the name in all caps. I feel like I am missing something obvious here.

Thanks,

-zach


Solution

The value you describe as "coming back" is the stored value, which is always the verbatim value you fed to Solr when indexing. Tokenizers, filters, etc. affect only the indexed terms that are used when searching (and the query terms); they never change the stored value. It's up to you to transform the stored value you get back into the form you want.
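
As a rough illustration of doing that transformation on the client side, here is a minimal Python sketch. The function name `capitalize_words` and the sample `docs` list are made up for this example and are not part of Solr or any Solr client library; the only assumption is that each result is available as a dictionary with a `name` key.

```python
def capitalize_words(value):
    """Return value with each whitespace-separated word lowercased
    except for its first letter, which is uppercased.

    Note: runs of whitespace are collapsed to a single space.
    """
    return " ".join(word[:1].upper() + word[1:].lower() for word in value.split())


# Hypothetical documents, shaped like the stored values a query might return.
docs = [{"name": "STAR WARS"}, {"name": "the EMPIRE strikes back"}]

# Rewrite the stored value for display; the index itself is untouched.
for doc in docs:
    doc["name"] = capitalize_words(doc["name"])

print(docs)
```

The same function could instead be applied to the data before it is sent to Solr, in which case the stored value would already be in the desired form.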

Licensed under: CC-BY-SA with attribution