Question

I'm creating a dictionary app in Android with Lucene. Do I need to supply the same instance of StandardAnalyzer when indexing and searching, or can I just supply a new instance for both?

For example, when I'm about to create an index, I do this:

Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_36);
IndexWriter writer = new IndexWriter(directory,
                    new IndexWriterConfig(Version.LUCENE_36, analyzer));

And then, when getting the best fragments of the search term in the top documents, I do this:

TokenStream ts = TokenSources.getAnyTokenStream(indexSearcher.getIndexReader(),
                    hits[i].doc, "definition", analyzer);

Or can I just replace every usage of analyzer with new StandardAnalyzer(Version.LUCENE_36)? I'm asking because my indexing and search tasks are in different classes, and I'd like to keep the number of objects I pass between them to a minimum.


Solution

You can definitely use different instances of the same analyzer/tokenizer.

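The only requirement is that they behave exactly the same way at index time and at search time, so that the search text is tokenized the same way the indexed text was. In practice that means constructing them identically: the same analyzer class, the same constructor arguments (e.g. the same Version constant) and, if the analyzer loads external data such as a stop-word list, the same data.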
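One way to guarantee identical construction without passing an analyzer between classes is a small factory that both the indexing and the search class call. The sketch below is only an illustration: ToyAnalyzer, Analyzers and AnalyzerDemo are hypothetical names, and ToyAnalyzer is a plain-Java stand-in for a Lucene analyzer (its behavior is fixed entirely by its constructor arguments), so the example runs without Lucene on the classpath. In the question's setup, the factory body would presumably just return new StandardAnalyzer(Version.LUCENE_36).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for a Lucene analyzer: everything it does is
// determined by the arguments passed to its constructor.
final class ToyAnalyzer {
    private final Set<String> stopWords;

    ToyAnalyzer(String... stopWords) {
        this.stopWords = new HashSet<>(Arrays.asList(stopWords));
    }

    // Lowercase the text, split on anything that is not a letter or digit,
    // and drop empty strings and stop words.
    List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty() && !stopWords.contains(t)) {
                tokens.add(t);
            }
        }
        return tokens;
    }
}

// Hypothetical factory: the single place where the analyzer configuration lives.
final class Analyzers {
    private Analyzers() {}

    static ToyAnalyzer create() {
        // Assumption: with Lucene 3.6 this line would instead be
        //   return new StandardAnalyzer(Version.LUCENE_36);
        return new ToyAnalyzer("the", "a", "of");
    }
}

class AnalyzerDemo {
    public static void main(String[] args) {
        String text = "The definition of a Word";

        // Two separate instances, built the same way, as in the indexing
        // class and the search class.
        ToyAnalyzer indexTime = Analyzers.create();
        ToyAnalyzer searchTime = Analyzers.create();
        System.out.println(indexTime.tokenize(text));  // [definition, word]
        System.out.println(searchTime.tokenize(text)); // [definition, word]

        // An instance built with different arguments tokenizes differently,
        // which is the mismatch to avoid between indexing and searching.
        ToyAnalyzer mismatched = new ToyAnalyzer();
        System.out.println(mismatched.tokenize(text)); // [the, definition, of, a, word]
    }
}
```

With this layout, neither class needs a reference to the other's analyzer; each one calls Analyzers.create(), and changing the configuration in one place changes it for both indexing and searching.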

Licensed under: CC-BY-SA with attribution