Question

My Lucene queries will usually consist of a bunch of AND-combined fields. Is it possible to get the queried fields back out of the Query object?

No correct solution

OTHER TIPS

Did you mean extracting the terms or the field names? Since you already know you're handling a BooleanQuery, you can extract the fields by iterating over the BooleanClause array returned by BooleanQuery.getClauses(), rewriting each clause to its base query (Query.rewrite), and recursing until you have a TermQuery on your hands. Its field name is then available via getTerm().field().
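To make the recursion concrete, here is a minimal sketch of the walk in plain Java. It deliberately does not use Lucene: TermNode and BoolNode are hypothetical stand-ins that only mirror the shape of TermQuery and BooleanQuery, and the rewrite step is left out. With real Lucene you would apply the same pattern to the clauses of the actual query.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class QueryFieldWalk {
    // Hypothetical stand-ins for a query tree; these are NOT Lucene classes.
    interface Node {}

    // Leaf: plays the role of a TermQuery (a field name plus the term text).
    static final class TermNode implements Node {
        final String field;
        final String text;
        TermNode(String field, String text) { this.field = field; this.text = text; }
    }

    // Inner node: plays the role of a BooleanQuery holding its clauses.
    static final class BoolNode implements Node {
        final List<Node> clauses;
        BoolNode(Node... clauses) { this.clauses = Arrays.asList(clauses); }
    }

    // Collect the distinct field names in first-seen order.
    static Set<String> collectFields(Node node) {
        Set<String> fields = new LinkedHashSet<>();
        collect(node, fields);
        return fields;
    }

    // Descend through nested boolean nodes until a leaf is reached.
    private static void collect(Node node, Set<String> out) {
        if (node instanceof TermNode) {
            out.add(((TermNode) node).field);
        } else if (node instanceof BoolNode) {
            for (Node clause : ((BoolNode) node).clauses) {
                collect(clause, out);
            }
        }
        // Any other node type is ignored in this sketch.
    }

    public static void main(String[] args) {
        Node query = new BoolNode(
            new TermNode("title", "lucene"),
            new BoolNode(new TermNode("author", "smith"), new TermNode("title", "search")));
        System.out.println(collectFields(query)); // prints [title, author]
    }
}
```

A set is used here because the question asks for the queried fields, not for every occurrence; swap it for a list if you need one entry per clause.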

If you did mean term extraction, I'm not sure about Lucene.NET, but in Java Lucene you can use org.apache.lucene.search.highlight.QueryTermExtractor; you pass a (rewritten) query to one of its getTerms overloads and get back an array of WeightedTerm objects.

As far as I remember, the downsides to using this technique are:

  • Since it internally uses a term set, it won't handle multiple instances of the same token: in "dream within a dream", for example, "dream" is reported only once
  • It only supports base query types (TermQuery, BooleanQuery and any other query type which supports Query.extractTerms). I believe we've used it internally for SpanNearQuery and SpanNearOrderedQuery instances, but I may be wrong on this.
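The first downside is simply set semantics. The sketch below illustrates it in plain Java, assuming a naive whitespace split as a stand-in for real analysis; tokens and uniqueTerms are hypothetical helpers, not part of Lucene or QueryTermExtractor.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class TermSetDemo {
    // Naive whitespace split; a stand-in for a real analyzer, not Lucene's tokenizer.
    static List<String> tokens(String phrase) {
        return Arrays.asList(phrase.trim().split("\\s+"));
    }

    // Storing the tokens in a set keeps each distinct token only once.
    static Set<String> uniqueTerms(String phrase) {
        return new LinkedHashSet<>(tokens(phrase));
    }

    public static void main(String[] args) {
        String phrase = "dream within a dream";
        System.out.println(tokens(phrase).size()); // prints 4
        System.out.println(uniqueTerms(phrase));   // prints [dream, within, a]
    }
}
```

If you need every occurrence (for counting or positional work, say), collect the terms into a list yourself instead of relying on a set-based extractor.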

Either way I hope this is enough to get you started.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow