Problem

Is it possible to modify Lucene 2.2 to add an Arabic analyzer? If anyone has already done this, where can I get the source or JAR?

Solution

Someone asked me before how to get Arabic and Persian support on Lucene 2.4, so the analyzers were unofficially backported here: http://people.apache.org/~rmuir/

http://people.apache.org/~rmuir/lucene-analyzers-2.4.1_with_arabic_and_farsi.jar
http://people.apache.org/~rmuir/arabicFarsiLucene241_contrib.patch
http://people.apache.org/~rmuir/arabicFarsiLucene241_core.patch

This would mean you only have to upgrade to 2.4.1, which might be easier than upgrading to 2.9 or 3.0.

Hope this helps.

Other tips

Lucene 3.0.1 includes an Arabic analyzer. It is in the contrib package.

You can upgrade to Lucene 3.0.1 to get this working out of the box. You probably will not be able to use it as-is with Lucene 2.2, because the TokenStream APIs changed between those versions. Still, back-porting the changes to 2.2 shouldn't be very difficult if you don't wish to migrate to the latest Lucene release.
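If you do go the back-port route, it may help to see the kind of character-level normalization an Arabic analyzer typically applies before stemming. The following is only a minimal sketch under stated assumptions: it is not the Lucene contrib code, the class and method names (`ArabicNormalizeSketch`, `normalize`) are hypothetical, and the set of folded characters is a common choice rather than a confirmed copy of what the contrib analyzer does. It uses only the Java standard library, so wiring it into a Lucene 2.2 `TokenFilter` is left out.

```java
// Hypothetical, stdlib-only sketch; not taken from Lucene.
class ArabicNormalizeSketch {

    // Fold common orthographic variants so that differently typed
    // spellings of the same word compare equal at index and query time.
    static String normalize(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c >= '\u064B' && c <= '\u0652') {
                continue; // drop diacritics in the range fathatan .. sukun
            }
            switch (c) {
                case '\u0640': // tatweel (elongation stroke): dropped
                    break;
                case '\u0622': // alef with madda above
                case '\u0623': // alef with hamza above
                case '\u0625': // alef with hamza below
                    out.append('\u0627'); // folded to bare alef
                    break;
                case '\u0649': // alef maksura (dotless yeh)
                    out.append('\u064A'); // folded to yeh
                    break;
                case '\u0629': // teh marbuta
                    out.append('\u0647'); // folded to heh
                    break;
                default:
                    out.append(c); // everything else passes through
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "Ahmad" typed with hamza and diacritics vs. the bare spelling
        String a = normalize("\u0623\u064E\u062D\u0652\u0645\u064E\u062F");
        String b = normalize("\u0627\u062D\u0645\u062F");
        System.out.println(a.equals(b)); // prints "true"
    }
}
```

The same function has to run on both indexed text and query text, otherwise the folded forms will not match. A real analyzer would also add stop-word removal and stemming on top of this step.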

Alternatively, you can try using lucene-hunspell as an analyzer. It currently works with the Lucene trunk; I do not know whether it works with Lucene 3.0.1. Robert Muir has written an explanation, and there is a list of dictionaries that includes Arabic. I believe you could also back-port this. Shashikant's suggestion seems easier to implement, while this one may give better quality.

License: CC-BY-SA with attribution
Not affiliated with StackOverflow