Question

I need to determine which part of a Lucene BooleanQuery failed if the entire query returns no results.

I'm using a BooleanQuery made up of 4 NumericRangeQueries and a PhraseQuery. Each is added to the query with Occur.MUST.

If I don't get any results for a query, is there a way to tell which part of the query failed to match anything? Do I need to run queries individually and compare results to get the one that failed?

Edit - Added PhraseQuery code.

if( row.getPropertykey_tx() != null && !row.getPropertykey_tx().trim().isEmpty()){
    PhraseQuery pQuery = new PhraseQuery();
    String[] words = row.getPropertykey_tx().trim().split(" ");
    for( String word : words ){
        pQuery.add(new Term(TitleRecordColumns.SA_SITE_ADDR.toString(), word));
    }
    pQuery.setSlop(2);

    topBQuery.add(pQuery, BooleanClause.Occur.MUST);
}

Solution

Running individual parts of the query is probably the simplest approach, to my mind.

Another available tool is an Explanation. You can call IndexSearcher.explain to get an Explanation of the scoring for the query against a particular document. If you can provide the doc ID of a document you believe should match the query, you can inspect Explanation.toString (or toHtml, if you prefer) to determine which subqueries do not match it.


If you want to automatically keep a record of which clause of a BooleanQuery doesn't produce results, I believe you will need to run each subquery independently. If you no longer have access to the subqueries used to create it, you can retrieve its clauses instead:

void findTroublesomeQuery(BooleanQuery query) {
    for (BooleanClause clause : query.clauses()) {
        Query subquery = clause.getQuery();
        TopDocs docs = searchHoweverYouDo(subquery);
        // TopDocs.totalHits holds the total number of matches for the subquery.
        if (docs.totalHits == 0) {
            // If you want to dig down recursively...
            if (subquery instanceof BooleanQuery)
                findTroublesomeQuery((BooleanQuery) subquery);
            else
                log(subquery); // Or do whatever you want to keep track of it.
        }
    }
}

DisjunctionMaxQuery is another commonly used query that wraps multiple subqueries, so it might be worth handling in this sort of approach as well.
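
Since more than one query type can wrap subqueries, the same walk can be written without depending on any particular query class. The following is only a sketch of that idea, not Lucene code: ZeroHitClauses, find, children and hitCount are hypothetical names, and the two functions passed in are assumed to be supplied by the caller (one returns the direct subqueries of a compound query, the other runs a single query by itself and returns its hit count). The demo in main uses plain strings as stand-ins for queries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.ToIntFunction;

// Hypothetical helper: not part of Lucene.
final class ZeroHitClauses {

    /**
     * Returns the subqueries of 'query' that match nothing when run alone.
     * children: direct subqueries of a compound query, empty list for a leaf.
     * hitCount: runs one query by itself and returns its number of hits.
     */
    static <Q> List<Q> find(Q query,
                            Function<Q, List<Q>> children,
                            ToIntFunction<Q> hitCount) {
        List<Q> failing = new ArrayList<>();
        for (Q sub : children.apply(query)) {
            if (hitCount.applyAsInt(sub) != 0) {
                continue; // this clause matches something by itself
            }
            if (children.apply(sub).isEmpty()) {
                failing.add(sub); // leaf clause with zero hits
                continue;
            }
            // Compound clause with zero hits: look for the failing parts inside it.
            List<Q> inner = find(sub, children, hitCount);
            if (inner.isEmpty()) {
                failing.add(sub); // every part matches alone, only the combination fails
            } else {
                failing.addAll(inner);
            }
        }
        return failing;
    }

    public static void main(String[] args) {
        // Toy data: query names mapped to their subqueries and to their hit counts.
        Map<String, List<String>> kids = new HashMap<>();
        kids.put("top", Arrays.asList("range1", "phrase", "nested"));
        kids.put("nested", Arrays.asList("a", "b"));
        Map<String, Integer> hits = new HashMap<>();
        hits.put("range1", 5);
        hits.put("a", 3);

        List<String> failing = find("top",
                q -> kids.getOrDefault(q, Collections.emptyList()),
                q -> hits.getOrDefault(q, 0));
        System.out.println(failing);
    }
}
```

With Lucene, children could return clause.getQuery() for each entry of BooleanQuery.clauses(), and hitCount could run IndexSearcher.search(subquery, 1) and read the total from the returned TopDocs. How that total is exposed differs between Lucene versions, so check the TopDocs documentation for the version you are on.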

Licensed under: CC-BY-SA with attribution