Question

I'm implementing full text search functionality on my rap website, and I'm running into some issues with rapper and song names.

For example, someone might want to search for the rapper "Cam'ron" using the query "camron" (leaving out the mid-word apostrophe). Likewise, someone might search for the song "3 Peat" using the query "3peat".

"The Notorious B.I.G." is a bit of a weird case: "The Notorious BIG" and "The Notorious B.I.G." both work (I guess because the solr.StandardFilterFactory removes dots from acronyms?), but "The Notorious B.I.G" (i.e., minus the trailing dot) doesn't.

Ideally all reasonable variations of these names should work. I'm guessing the answer has something to do with the solr.WordDelimiterFilterFactory, but I'm not sure.

Also, I'm using Sunspot with Rails if that's relevant.


Solution

Yes, you are right. You need to configure WordDelimiterFilterFactory properly. Try enabling all of the options below, and don't forget preserveOriginal, which keeps the original term in addition to the generated ones.

generateWordParts - splits a term into its word parts: B.I.G. becomes B, I, G

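generateNumberParts - emits the number parts of a term: 3Peat yields 3 (Peat is emitted by generateWordParts, and the split at the letter/digit boundary comes from splitOnNumerics, which is on by default)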

catenateWords - joins adjacent word parts: B.I.G. also yields BIG

catenateNumbers - joins adjacent number parts: Rapper 802.11 also yields Rapper 80211

catenateAll - joins all parts, words and numbers alike: Rapper-802.11 also yields Rapper80211

splitOnCaseChange - splits where the case changes from lower to upper: GanGsTa becomes Gan, Gs, Ta

preserveOriginal - also keeps the original term unchanged: Rapper-802.11RuuLlZ is still indexed as Rapper-802.11RuuLlZ, alongside its generated parts.
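For reference, here is a rough sketch of how these options might look in schema.xml. The field type name "text", the choice of tokenizer, and the trailing lowercase filter are assumptions on my part, not something taken from your setup, so adapt them to whatever your Sunspot-generated schema already contains. I used the whitespace tokenizer because a tokenizer that strips punctuation itself leaves the word delimiter filter with nothing to split on.

```xml
<!-- Sketch only: the name "text" and the surrounding analyzer chain are assumed -->
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Split on whitespace only, so apostrophes, dots and hyphens reach the filter -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- All options from the list above switched on -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1"
            generateNumberParts="1"
            catenateWords="1"
            catenateNumbers="1"
            catenateAll="1"
            splitOnCaseChange="1"
            preserveOriginal="1"/>
    <!-- Lowercase last, after the case-change split has been applied -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Because the same analyzer runs at index time and at query time here, existing documents have to be reindexed after the change (with Sunspot, typically via rake sunspot:reindex) before queries like "camron" or "3peat" can match.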

Licensed under: CC-BY-SA with attribution