Question

In a Rails app I started using Sunspot: https://github.com/sunspot/sunspot/blob/master/README.md

Everything went OK until I noticed this (taken from the rails-console):

1.9.3p194 :002 > MyModel.search{fulltext "leon"}.results
=> [#<MyModel id: 16, name: "Leon">]
1.9.3p194 :003 > MyModel.search{fulltext "león"}.results
=> [#<MyModel id: 18, name: "León">]

How can I tell the system not to distinguish between "leon" and "león"? I want something like search{fulltext "leon"} => [#<MyModel id: 16 ...>, #<MyModel id: 18 ...>].

I've been searching for this problem and I find the same response every time:

With this line in the Gemfile it works until the next release of rsolr: gem 'rsolr', :git => "https://github.com/mwmitchell/rsolr.git"

thx


Solution 2

Thanks for the responses. At last I solved it last night with another idea I took from http://codeshooter.wordpress.com/2011/01/13/full-text-search-in-in-rails-with-sunspot-and-solr/

The idea is to add this in Restaurant.rb:

searchable do
  # index the accent-stripped, lowercased name instead of the raw value
  text :name do
    name.my_normalize
  end
end

and the body of the my_normalize function is:

to_s.mb_chars.normalize(:kd).gsub(/[^\x00-\x7F]/,'').downcase

That line turns a string like "äáàÁÄÀ" into "aaaaaa".
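
For reference, here is a minimal plain-Ruby sketch of the same normalization, assuming Ruby 2.2 or later, where String#unicode_normalize is available without ActiveSupport. The method name strip_accents is made up for this example and is not part of Sunspot or Rails.

```ruby
# Hypothetical helper (the name is an assumption, not from any library).
# It decomposes accented characters, drops the non-ASCII remainder
# (including the combining accent marks) and lowercases the result.
def strip_accents(value)
  value.to_s
       .unicode_normalize(:nfkd)   # "é" becomes "e" + combining acute accent
       .gsub(/[^\x00-\x7F]/, '')   # remove every character outside ASCII
       .downcase
end

puts strip_accents("León")
puts strip_accents("äáàÁÄÀ")
```

Since only the indexed value is normalized this way, the search term probably needs the same treatment before it is passed to fulltext, and existing records have to be reindexed for the change to take effect.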

OTHER TIPS

In schema.xml you need to add a character filter, as described in AnalyzersTokenizersTokenFilters, to the analyzer of the field type you search on. For example:

<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>

In mapping-ISOLatin1Accent.txt you should have entries that map each accented Unicode character to an ASCII character sequence. You can see an example here: mapping-ISOLatin1Accent.txt

You need to make the changes inside the configuration files of Solr (the application, not the gem). Solr is "embedded" in the gem, but you can access its configuration as if it were installed separately. Have a look at the Solr documentation.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow