Get query terms from Lucene query for highlighting
https://stackoverflow.com/questions/552829
Do you mean extracting the terms or the field names? Since you already know you are handling a BooleanQuery, you can extract the fields by iterating over the BooleanClause array returned by BooleanQuery.getClauses(), rewriting each clause's query to its base form (Query.rewrite), and recursing until you reach a TermQuery, whose term carries the field name.
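The recursion described above can be sketched without any Lucene dependency. The types below (QueryNode, TermNode, BooleanNode, FieldCollector) are hypothetical stand-ins that only model the shape of a query tree; they are not Lucene classes, and the rewrite step is left out. In real code you would test for BooleanQuery and TermQuery instead.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins for Lucene's Query, TermQuery and BooleanQuery.
// They only model the tree shape and are NOT the Lucene classes.
interface QueryNode {}

class TermNode implements QueryNode {
    final String field;
    final String text;
    TermNode(String field, String text) { this.field = field; this.text = text; }
}

class BooleanNode implements QueryNode {
    final List<QueryNode> clauses;
    BooleanNode(QueryNode... clauses) { this.clauses = Arrays.asList(clauses); }
}

class FieldCollector {
    // Walk the tree depth-first and record the field of every leaf term.
    static Set<String> collectFields(QueryNode query) {
        Set<String> fields = new LinkedHashSet<>();
        collect(query, fields);
        return fields;
    }

    private static void collect(QueryNode query, Set<String> fields) {
        if (query instanceof TermNode) {
            // Leaf: this is where the recursion bottoms out.
            fields.add(((TermNode) query).field);
        } else if (query instanceof BooleanNode) {
            // Inner node: recurse into each clause.
            for (QueryNode clause : ((BooleanNode) query).clauses) {
                collect(clause, fields);
            }
        }
        // Any other node type is silently ignored in this sketch.
    }

    public static void main(String[] args) {
        QueryNode q = new BooleanNode(
            new TermNode("title", "dream"),
            new BooleanNode(
                new TermNode("body", "dream"),
                new TermNode("title", "within")));
        System.out.println(collectFields(q)); // prints [title, body]
    }
}
```

The set keeps each field name once, in first-seen order, which is usually what you want for field names even though it is a drawback for terms.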
If you did mean term extraction, I'm not sure about Lucene.NET, but in Java Lucene you can use org.apache.lucene.search.highlight.QueryTermExtractor: pass a (rewritten) query to one of its getTerms overloads and you get back an array of WeightedTerm objects.
As far as I remember, the downsides to using this technique are:
- Since it internally uses a term set, it won't handle multiple instances of the same token, e.g. "dream within a dream" (the second "dream" is collapsed into the first)
- It only supports base query types (TermQuery, BooleanQuery and any other query type which supports Query.extractTerms). I believe we've used it internally for SpanNearQuery and SpanNearOrderedQuery instances, but I may be wrong on this.
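The first downside is easy to see with plain Java collections. This is only an illustration of set semantics, not QueryTermExtractor's actual code; the class TermSetDemo and its methods are hypothetical names, and whitespace splitting stands in for a real analyzer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class TermSetDemo {
    // Collect tokens into a set, as a set-based extractor would:
    // repeated tokens collapse into a single entry.
    static Set<String> asSet(String phrase) {
        return new LinkedHashSet<>(Arrays.asList(phrase.split("\\s+")));
    }

    // Collect tokens into a list, which keeps repeats and their order.
    static List<String> asList(String phrase) {
        return new ArrayList<>(Arrays.asList(phrase.split("\\s+")));
    }

    public static void main(String[] args) {
        String phrase = "dream within a dream";
        System.out.println(asSet(phrase));  // prints [dream, within, a]
        System.out.println(asList(phrase)); // prints [dream, within, a, dream]
    }
}
```

If your highlighter needs to know that a token occurs more than once in the query, you would have to gather the terms yourself into a list rather than rely on a set-based result.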
Either way I hope this is enough to get you started.
Licensed under: CC-BY-SA with attribution