Question

I'm trying to generate a query in Java to search a Lucene index. The records in question have a recordState field so I'm starting my query with the following:

BooleanQuery booleanQuery = new BooleanQuery();
booleanQuery.add(new TermQuery(new Term("recordState", "DRAFT")), Occur.MUST);

The trouble comes when I want to add a filter supplied by the user. I had tried changing the code to:

String userQuery = ""; // This will be whatever the user types in
QueryParser queryParser = new QueryParser(Version.LUCENE_29, "", new StandardAnalyzer(Version.LUCENE_29,
  new HashSet<String>()));
BooleanQuery booleanQuery = new BooleanQuery();
booleanQuery.add(new TermQuery(new Term("recordState", "DRAFT")), Occur.MUST);
booleanQuery.add(queryParser.parse(userQuery), Occur.MUST);

If the user enters record_id:123 as their query, I end up with +recordState:DRAFT +record_id:123, which is great. If the user enters +record_id:123, the final query is +recordState:DRAFT +(+record_id:123), which is not ideal but works.

But if a user enters -record_id:123, the final query is +recordState:DRAFT +(-record_id:123), which looks invalid and doesn't make much sense!

Is there a better way to combine the two query parts? I can't simply append the user's query as plain text, because if it doesn't start with + or -, the query would end up as +recordState:DRAFT record_id:123 (record state = draft OR record ID = 123).

I'm thinking the only thing I can do is test whether the user query contains only one term and, if so, remove any preceding +/-. But I would like to do this without any String manipulation, sticking to the Lucene API.


Solution

The query you end up with isn't actually invalid, but it may not behave the way you expect.

The query:

-record_id:123

is not very useful on its own. Lucene does not support purely negative queries: it needs something to search for, and if it is only given things not to match, it will match nothing.

Since your goal appears to be restricting an otherwise valid user query to documents with recordState = DRAFT, it would be reasonable to return no results here, because the user-entered query isn't useful by itself.
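To make the "matches nothing" behaviour concrete, here is a small toy model in plain Java. It is only a sketch: Node, Clause, term, bool and makeDoc are hypothetical stand-ins invented for illustration, not Lucene classes, and the matching rule simply encodes the behaviour described above (prohibited clauses filter documents but never select them).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of Lucene-style boolean matching. Stand-in types, NOT the Lucene API. */
class PureNegativeDemo {
    enum Occur { MUST, SHOULD, MUST_NOT }

    interface Node { boolean matches(Map<String, String> doc); }

    /** Leaf: matches when the document's field equals the value (like field:value). */
    static Node term(String field, String value) {
        return doc -> value.equals(doc.get(field));
    }

    /** A clause pairs a sub-query with its occurrence flag (+, none, or -). */
    static final class Clause {
        final Occur occur;
        final Node node;
        Clause(Occur occur, Node node) { this.occur = occur; this.node = node; }
    }

    /** Boolean group: prohibited clauses only filter, they never select documents. */
    static Node bool(Clause... clauses) {
        List<Clause> list = Arrays.asList(clauses);
        return doc -> {
            boolean hasMust = false;
            boolean shouldHit = false;
            for (Clause c : list) {
                boolean hit = c.node.matches(doc);
                if (c.occur == Occur.MUST_NOT && hit) return false;
                if (c.occur == Occur.MUST) {
                    if (!hit) return false;
                    hasMust = true;
                }
                if (c.occur == Occur.SHOULD && hit) shouldHit = true;
            }
            // No MUST clause and no matching SHOULD clause means nothing selected
            // this document, so a purely negative group matches no document at all.
            return hasMust || shouldHit;
        };
    }

    /** Hypothetical helper that builds a two-field document. */
    static Map<String, String> makeDoc(String state, String id) {
        Map<String, String> d = new HashMap<>();
        d.put("recordState", state);
        d.put("record_id", id);
        return d;
    }

    public static void main(String[] args) {
        Node draft = term("recordState", "DRAFT");
        Node id123 = term("record_id", "123");

        // +recordState:DRAFT +(-record_id:123)
        Node nested = bool(new Clause(Occur.MUST, draft),
                new Clause(Occur.MUST, bool(new Clause(Occur.MUST_NOT, id123))));
        // +recordState:DRAFT -record_id:123
        Node flat = bool(new Clause(Occur.MUST, draft),
                new Clause(Occur.MUST_NOT, id123));

        Map<String, String> otherDraft = makeDoc("DRAFT", "456");
        System.out.println(nested.matches(otherDraft)); // prints false
        System.out.println(flat.matches(otherDraft));   // prints true
    }
}
```

In this model the nested form rejects every document, because the inner group has nothing positive to select with, while the flattened form keeps draft records other than 123.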

A query like:

+recordState:DRAFT +(-record_id:123 anotherfield:terms)

would be just fine, as would:

+recordState:DRAFT +(+record_id:123)

The + inside the parentheses isn't strictly necessary, but the query works without any problems.

If you want to detect a purely negative query like this, you can either analyze the query string or iterate over the BooleanQuery returned by the parser, with something like:

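// Assumes the parser returns a BooleanQuery; the cast throws ClassCastException otherwise.
BooleanQuery query = (BooleanQuery) queryParser.parse(userQuery);
BooleanClause[] clauses = query.getClauses();
if (clauses.length == 1 && clauses[0].getOccur() == BooleanClause.Occur.MUST_NOT) {
    // A lone prohibited clause: add it directly so it sits beside +recordState:DRAFT
    booleanQuery.add(clauses[0]);
} else {
    // Anything else is wrapped as a required sub-query, as before
    booleanQuery.add(query, BooleanClause.Occur.MUST);
}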

This might deal with that very specific case, but keep in mind that if the user can create a query of arbitrary complexity, they can just as easily create the same issue nested somewhere deeper in the query. Also, the query parser is not guaranteed to return a BooleanQuery, so the cast above is a bit of an assumption.
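As a rough illustration of the nesting problem, the sketch below walks a query tree recursively and reports whether any group, at any depth, consists only of prohibited clauses. The Node and Clause types here are hypothetical stand-ins, not Lucene classes. Against the Lucene 2.9 API used in the question, the same walk would presumably go through BooleanQuery.getClauses(), BooleanClause.getQuery() and BooleanClause.getOccur(), with an instanceof check in place of the cast.

```java
import java.util.Arrays;
import java.util.List;

/** Stand-in query tree for illustration only; these are NOT Lucene classes. */
class NestedNegativeCheck {
    enum Occur { MUST, SHOULD, MUST_NOT }

    /** A node is either a leaf term (no clauses) or a boolean group of clauses. */
    static final class Node {
        final String label;          // e.g. "record_id:123" for a leaf
        final List<Clause> clauses;  // empty for a leaf
        Node(String label, Clause... clauses) {
            this.label = label;
            this.clauses = Arrays.asList(clauses);
        }
        boolean isLeaf() { return clauses.isEmpty(); }
    }

    static final class Clause {
        final Occur occur;
        final Node node;
        Clause(Occur occur, Node node) { this.occur = occur; this.node = node; }
    }

    /** True if this group, or any group nested below it, holds only MUST_NOT clauses. */
    static boolean containsPureNegative(Node node) {
        if (node.isLeaf()) return false;
        boolean allProhibited = true;
        for (Clause c : node.clauses) {
            if (c.occur != Occur.MUST_NOT) allProhibited = false;
            if (containsPureNegative(c.node)) return true; // problem found deeper down
        }
        return allProhibited;
    }

    public static void main(String[] args) {
        Node draft = new Node("recordState:DRAFT");
        Node id123 = new Node("record_id:123");
        Node other = new Node("anotherfield:terms");

        // +recordState:DRAFT +(-record_id:123)
        Node bad = new Node("root",
                new Clause(Occur.MUST, draft),
                new Clause(Occur.MUST, new Node("group", new Clause(Occur.MUST_NOT, id123))));

        // +recordState:DRAFT +(-record_id:123 anotherfield:terms)
        Node ok = new Node("root",
                new Clause(Occur.MUST, draft),
                new Clause(Occur.MUST, new Node("group",
                        new Clause(Occur.MUST_NOT, id123),
                        new Clause(Occur.SHOULD, other))));

        System.out.println(containsPureNegative(bad)); // prints true
        System.out.println(containsPureNegative(ok));  // prints false
    }
}
```

The check is deliberately conservative: it flags a purely negative group wherever it appears, including under an optional clause where the rest of the query could still return results.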

License: CC-BY-SA with attribution
Not affiliated with StackOverflow