Question

Let's say I have a binary field, checked. Let's also assume that 3 documents out of 10 have checked:1 and the others have checked:0.

When I search in Lucene:

checked:1 - returns correct result (3)
checked:0 - returns correct result (7)
-checked:1 - returns correct result (7)
-checked:0 - returns correct result (3)

BUT

-(-(checked:1)) - suddenly returns a wrong result (10, i.e. the entire data set).

Any idea why the Lucene query parser acts so weird?


Solution 2

After some research, trial and error, and building on the answer from midas, I have come up with a method to resolve this inconsistency. When I say inconsistency, I mean from a user's common-sense point of view. From an information retrieval perspective, midas has linked an interesting article which explains why such a query makes no sense. So the trick is to pair each negative expression with a match-all clause (the MatchAllDocsQueryNode class), so that the rewritten query looks like this:

 -(-(checked:1 *:*) *:*)

With that rewrite, the query produces the expected result. I accomplished it by writing my own node processor class, which performs the necessary rewriting.
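To see why pairing every negation with a match-all clause changes the outcome, here is a minimal sketch in plain Java that models the 10-document example with sets of document IDs. It does not use Lucene. The class and method names are hypothetical, and it assumes that a top-level pure-negative query is implicitly evaluated against all documents while a nested one is not. That assumption is consistent with the counts reported in the question (7, 3 and 10), but it is not confirmed by the source.

```java
import java.util.Set;
import java.util.TreeSet;

// Toy model only (hypothetical names, no Lucene involved):
// doc IDs 0..9, where documents 0, 1 and 2 have checked:1.
final class DoubleNegationModel {
    static final Set<Integer> NONE = new TreeSet<>();     // no positive clause
    static final Set<Integer> ALL = range(0, 10);         // stands in for *:*
    static final Set<Integer> CHECKED_1 = range(0, 3);    // stands in for checked:1

    static Set<Integer> range(int from, int to) {
        Set<Integer> ids = new TreeSet<>();
        for (int i = from; i < to; i++) {
            ids.add(i);
        }
        return ids;
    }

    // A prohibited clause only removes documents from the candidates.
    static Set<Integer> minus(Set<Integer> candidates, Set<Integer> prohibited) {
        Set<Integer> result = new TreeSet<>(candidates);
        result.removeAll(prohibited);
        return result;
    }

    // -(-(checked:1)): the nested negation has no positive clause, so it
    // matches nothing. ASSUMPTION: the outer, top-level negation is
    // evaluated against all documents.
    static Set<Integer> unpairedDoubleNegation() {
        Set<Integer> inner = minus(NONE, CHECKED_1);
        return minus(ALL, inner);
    }

    // Every negation is paired with match-all, as the rewrite above intends.
    static Set<Integer> pairedDoubleNegation() {
        Set<Integer> inner = minus(ALL, CHECKED_1);
        return minus(ALL, inner);
    }

    public static void main(String[] args) {
        System.out.println(unpairedDoubleNegation().size()); // 10
        System.out.println(pairedDoubleNegation());          // [0, 1, 2]
    }
}
```

In this model the unpaired form subtracts an empty set from everything, which is why the whole data set comes back, while the paired form recovers the 3 documents with checked:1.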

OTHER TIPS

Each Lucene query has to contain at least one positive clause (either MUST/+ or SHOULD) so that it has something to match; a negative clause only excludes documents. So your queries -checked:1 and -checked:0 are invalid, and I am surprised you are getting any results.

These queries should (most likely) look like this:

  • +*:* -checked:1
  • +*:* -checked:0
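The role of the positive clause can be illustrated with a small set-based sketch in plain Java. This is a toy model of the 10-document example from the question, not Lucene code, and the names in it are hypothetical. It only shows that a prohibited clause removes documents from whatever the positive clause selected.

```java
import java.util.Set;
import java.util.TreeSet;

// Toy model only (hypothetical names, no Lucene involved):
// doc IDs 0..9, where documents 0, 1 and 2 have checked:1.
final class NegativeClauseModel {
    static final Set<Integer> NONE = new TreeSet<>();     // no positive clause
    static final Set<Integer> ALL = range(0, 10);         // stands in for *:*
    static final Set<Integer> CHECKED_1 = range(0, 3);    // stands in for checked:1

    static Set<Integer> range(int from, int to) {
        Set<Integer> ids = new TreeSet<>();
        for (int i = from; i < to; i++) {
            ids.add(i);
        }
        return ids;
    }

    // Candidates come from the positive clause; the prohibited clause
    // can only remove documents from them.
    static Set<Integer> evaluate(Set<Integer> positive, Set<Integer> prohibited) {
        Set<Integer> result = new TreeSet<>(positive);
        result.removeAll(prohibited);
        return result;
    }

    public static void main(String[] args) {
        // -checked:1 with no positive clause: nothing to subtract from.
        System.out.println(evaluate(NONE, CHECKED_1).size()); // 0
        // +*:* -checked:1: all documents minus the checked ones.
        System.out.println(evaluate(ALL, CHECKED_1).size());  // 7
    }
}
```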

Getting back to your problem: double negation makes no sense in Lucene. Why would you need double negation? What are you trying to query?

Generally speaking, don't think of the Lucene query operators (!, &&, ||) as true Boolean operators; they don't behave exactly the way Boolean logic would suggest.

Licensed under: CC-BY-SA with attribution