
I would like to use Lucene to index and search text. The text can contain mistyped words, names, etc. What is the simplest way of getting Lucene to find a document containing

"this is Licene" 

when the user searches for


This is only for a demo app, so we need the simplest solution.



Lucene's fuzzy queries are based on Levenshtein edit distance.
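To see why a fuzzy query can match "licene" against "lucene", here is a minimal sketch of the textbook Levenshtein computation in plain Java. The class and method names (EditDistanceDemo, levenshtein) are made up for illustration, and this is not how Lucene computes it internally (Lucene builds an automaton rather than comparing terms one by one). The inputs are lowercased on the assumption that your analyzer lowercases terms.

```java
// Illustrative sketch only; Lucene's FuzzyQuery does not call this code.
class EditDistanceDemo {

    // Classic dynamic-programming Levenshtein distance, keeping only two rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j chars of b takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i chars of a into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1), // insert, delete
                        prev[j - 1] + cost);                    // substitute or match
            }
            int[] tmp = prev; // swap rows so prev always holds the last finished row
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        // One substitution (i -> u) separates the two terms.
        System.out.println(levenshtein("licene", "lucene")); // prints 1
    }
}
```

Since the distance here is 1, a fuzzy query allowing one edit is enough for this particular typo.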

Use a fuzzy query in the QueryParser, with syntax like:


Or create a FuzzyQuery, passing in the maximum number of edits, something like:

Query query = new FuzzyQuery(new Term("field", "lucene"), 1);

Note: FuzzyQuery, in Lucene 4.x, does not support edit distances greater than 2.

Other tips

Another option you could try is using the Lucene SpellChecker:


It works out of the box and is very easy to use:

  SpellChecker spellchecker = new SpellChecker(spellIndexDirectory);
  // To index a field of a user index:
  spellchecker.indexDictionary(new LuceneDictionary(my_lucene_reader, a_field));
  // To index a file containing words:
  spellchecker.indexDictionary(new PlainTextDictionary(new File("myfile.txt")));
  String[] suggestions = spellchecker.suggestSimilar("misspelt", 5);

By default it uses LevensteinDistance, but you can provide your own customized edit distance.
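To give an idea of what a customized measure could look like, here is a rough, self-contained sketch. Everything in it is an assumption for illustration: the Similarity interface is a hypothetical local stand-in (not a Lucene type), and the bigram overlap measure is swapped in as an example, it is not an edit distance and not what Lucene ships by default. With the real library you would implement the string-distance interface from Lucene's spell package instead; check the Javadoc of your version for the exact signature.

```java
import java.util.HashSet;
import java.util.Set;

class CustomDistanceSketch {

    // Hypothetical stand-in for a pluggable similarity measure; not a Lucene type.
    interface Similarity {
        // 1.0 means identical strings, 0.0 means nothing in common.
        float score(String s1, String s2);
    }

    // Jaccard overlap of character bigrams: |A and B| / |A or B|.
    static class BigramSimilarity implements Similarity {
        @Override
        public float score(String s1, String s2) {
            if (s1.equals(s2)) {
                return 1.0f;
            }
            Set<String> a = bigrams(s1);
            Set<String> b = bigrams(s2);
            Set<String> union = new HashSet<>(a);
            union.addAll(b);
            if (union.isEmpty()) {
                return 0.0f; // both strings too short to have bigrams
            }
            a.retainAll(b); // a now holds the intersection
            return (float) a.size() / union.size();
        }

        private static Set<String> bigrams(String s) {
            Set<String> out = new HashSet<>();
            for (int i = 0; i + 2 <= s.length(); i++) {
                out.add(s.substring(i, i + 2));
            }
            return out;
        }
    }

    public static void main(String[] args) {
        Similarity sim = new BigramSimilarity();
        // "licene" and "lucene" share the bigrams ce, en, ne out of 7 distinct ones.
        System.out.println(sim.score("licene", "lucene"));
    }
}
```

A higher score means a closer match, so suggestions would be ranked from highest to lowest.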
