Question

Is there a Solr/Lucene filter for analyzing text in Latin (the language, not the script type)? They exist for many other languages (Italian, Czech, etc.) but Latin isn't included in the Solr distribution by default.

This makes sense, of course (no one speaks Latin any more...), but I'm hoping to find one. Perhaps there's a list of plugins somewhere I could see. It's difficult to search for because all of the results are just for Latin encoding blocks.

No accepted solution

Other answers

Unless you need stemming, StandardAnalyzer should be a reasonable starting point, though its default stop word set would not be particularly useful for Latin.
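As a rough sketch of that starting point, a Solr field type can use roughly the same chain as StandardAnalyzer but with your own stop word list. The field type name `text_la` and the file `lang/stopwords_la.txt` are assumptions made for illustration; Solr does not ship a Latin stop word file, so you would have to supply one yourself.

```xml
<!-- Hypothetical field type for Latin text: tokenization and lowercasing only, no stemming. -->
<fieldType name="text_la" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Same tokenizer that StandardAnalyzer uses -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- stopwords_la.txt is an assumed file name: one Latin stop word per line (et, in, est, ...) -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_la.txt"/>
  </analyzer>
</fieldType>
```

If a stemmer turns out to be necessary, its filter would go after the stop filter in the same chain.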

If you are looking for a stemmer, there is a LatinStemFilter out there as well. You can find it at LUCENE-4229. I don't really know how effective it is at this point, though.

There is an external project that does Latin stemming and Latin number conversion.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow