Running individual parts of the query is probably the simplest approach, to my mind.
Another available tool is an Explanation. You can call IndexSearcher.explain to get an Explanation of the scoring for the query against a particular document. If you can provide the docid of a document you believe should match the query, you can analyze Explanation.toString (or toHtml, if you prefer and your Lucene version still provides it) to determine which subqueries are not matching against it.
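Reading a long Explanation.toString by eye gets tedious, so here is a rough sketch of walking the Explanation tree and keeping only the parts that did not match. To keep it runnable without Lucene on the classpath, the walker is generic and the demo uses a stand-in Node class; Node, ExplanationWalk, nonMatching and the description strings are all made up for illustration and are not Lucene API or real Lucene output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

final class ExplanationWalk {

    /** Hypothetical stand-in for an explanation node; not a Lucene class. */
    static final class Node {
        final boolean match;
        final String description;
        final List<Node> details;

        Node(boolean match, String description, Node... details) {
            this.match = match;
            this.description = description;
            this.details = Arrays.asList(details);
        }
    }

    /** Returns the descriptions of all non-matching nodes, indented by depth. */
    static <T> List<String> nonMatching(T root,
                                        Predicate<T> isMatch,
                                        Function<T, List<T>> details,
                                        Function<T, String> describe) {
        List<String> out = new ArrayList<>();
        collect(root, 0, isMatch, details, describe, out);
        return out;
    }

    private static <T> void collect(T node, int depth,
                                    Predicate<T> isMatch,
                                    Function<T, List<T>> details,
                                    Function<T, String> describe,
                                    List<String> out) {
        if (isMatch.test(node)) {
            return; // a matching subtree is not what we are hunting for
        }
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            line.append("  ");
        }
        out.add(line.append(describe.apply(node)).toString());
        for (T child : details.apply(node)) {
            collect(child, depth + 1, isMatch, details, describe, out);
        }
    }

    public static void main(String[] args) {
        // Illustrative tree only: one clause matches, one does not.
        Node root = new Node(false, "query failed",
                new Node(true, "title clause matched"),
                new Node(false, "body clause did not match",
                        new Node(false, "no matching term")));
        for (String line : nonMatching(root, n -> n.match, n -> n.details, n -> n.description)) {
            System.out.println(line);
        }
    }
}
```

With Lucene available, the same helper should plug into a real Explanation along the lines of nonMatching(searcher.explain(query, docId), Explanation::isMatch, e -> Arrays.asList(e.getDetails()), Explanation::getDescription). As far as I can tell, a BooleanQuery's explanation reports required clauses that failed but may leave out optional clauses that simply didn't match, so treat the result as a hint rather than a complete list.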
If you want to automatically keep a record of which clause of a BooleanQuery doesn't produce results, I believe you will need to run each subquery independently. If you no longer have access to the subqueries used to create it, you can recover them from the BooleanQuery's clauses instead:
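void findTroublesomeQuery(BooleanQuery query) {
    for (BooleanClause clause : query.clauses()) {
        Query subquery = clause.getQuery();
        // searchHoweverYouDo is a placeholder for your own search call.
        TopDocs docs = searchHoweverYouDo(subquery);
        // totalHits is a plain number before Lucene 8; later versions wrap it in a TotalHits object.
        if (docs.totalHits == 0) {
            // If you want to dig down recursively...
            if (subquery instanceof BooleanQuery)
                findTroublesomeQuery((BooleanQuery) subquery);
            else
                log(subquery); // Or do whatever you want to keep track of it.
        }
    }
}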
DisjunctionMaxQuery is another commonly used query that wraps multiple subqueries (available through getDisjuncts()), so it may be worth handling in the same way.