Question

I need to determine which part of a Lucene BooleanQuery failed if the entire query returns no results.

I'm using a BooleanQuery made up of 4 NumericRangeQueries and a PhraseQuery. Each is added to the query with Occur.MUST.

If I don't get any results for a query, is there a way to tell which part of the query failed to match anything? Do I need to run queries individually and compare results to get the one that failed?

Edit - Added PhraseQuery code.

if( row.getPropertykey_tx() != null && !row.getPropertykey_tx().trim().isEmpty()){
    PhraseQuery pQuery = new PhraseQuery();
    String[] words = row.getPropertykey_tx().trim().split(" ");
    for( String word : words ){
        pQuery.add(new Term(TitleRecordColumns.SA_SITE_ADDR.toString(), word));
    }
    pQuery.setSlop(2);

    topBQuery.add(pQuery, BooleanClause.Occur.MUST);
}

Solution

Running individual parts of the query is probably the simplest approach, to my mind.

Another available tool is an Explanation. You can call IndexSearcher.explain to get an Explanation of the scoring for the query against a particular document. If you can provide the doc ID of a document you believe should match the query, you can analyze Explanation.toString (or toHtml, if you prefer) to determine which subqueries are not matching against it.
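As a rough sketch of the Explanation approach, the method below explains each top-level clause separately against one document and prints the clauses that do not match it. It assumes Lucene is on the classpath and that you already have an open IndexSearcher and the internal doc ID of a document you expect to match. The class name ClauseExplainer and the method name reportNonMatchingClauses are made up for illustration and are not part of Lucene.

```java
import java.io.IOException;

import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Explanation;
import org.apache.lucene.search.IndexSearcher;

// Hypothetical helper class, not part of Lucene.
public class ClauseExplainer {

    // docId is the internal Lucene id of a document you believe should match.
    public static void reportNonMatchingClauses(IndexSearcher searcher,
                                                BooleanQuery query,
                                                int docId) throws IOException {
        for (BooleanClause clause : query.clauses()) {
            // Explain each clause on its own against the chosen document.
            Explanation explanation = searcher.explain(clause.getQuery(), docId);
            if (!explanation.isMatch()) {
                System.out.println("No match: " + clause.getQuery());
                // Full scoring breakdown for this clause.
                System.out.println(explanation);
            }
        }
    }
}
```

This only tells you why one specific document fails to match. It does not tell you whether a clause matches nothing at all in the index, which is what running the clauses individually would show.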


If you want to automatically keep a record of which clause of a BooleanQuery doesn't produce results, I believe you will need to run each query independently. If you no longer have access to the subqueries used to create it, you can get its clauses from the BooleanQuery instead:

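void findTroublesomeQuery(BooleanQuery query) {
    for (BooleanClause clause : query.clauses()) {
        Query subquery = clause.getQuery();
        // searchHoweverYouDo and log are placeholders for your own code.
        TopDocs docs = searchHoweverYouDo(subquery);
        // totalHits is an int in older Lucene versions; newer ones wrap it in a TotalHits object.
        if (docs.totalHits == 0) {
            // If you want to dig down recursively...
            if (subquery instanceof BooleanQuery)
                findTroublesomeQuery((BooleanQuery) subquery);
            else
                log(subquery); // Or do whatever you want to keep track of it.
        }
    }
}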

DisjunctionMaxQuery is another commonly used query that wraps multiple subqueries, so it might be worth handling in this sort of approach as well.

License: CC-BY-SA with attribution