I'd like to change the similarity before searching the index. This is what I do:

QueryParser parser = new QueryParser(Version.LUCENE_43, "field", standardAnalyzer);
System.out.println("similarity before: " + parser.getFuzzyMinSim());
parser.setFuzzyMinSim(0.6f);
System.out.println("similarity after: " + parser.getFuzzyMinSim());
Query query = parser.parse(inputString); // inputString is given by the user
System.out.println("Querystring: " + query.toString());

Now, when inputString = "something~", I get this output:

similarity before: 2.0
similarity after: 0.5
Querystring: field:something~2 // Why 2!?

My questions:

  1. Why is the similarity 2.0 at the beginning (I thought the default was 0.5)?
  2. Why is it still 2.0 after calling the setFuzzyMinSim method?

Solution

FuzzyQuery changed significantly in Lucene 4. The number after the '~' is now a maximum edit distance, not a minimum similarity. I'm not entirely clear on how FuzzyMinSim is mapped to a maximum edit distance when the standard query parser generates a FuzzyQuery. Note that using DefaultFuzzyMinSim is deprecated in 4.x.

The default maximum edit distance is 2. The FuzzyQuery class does not support edit distances greater than 2, so the standard query parser does not support them either. That is why the value reported before any change is 2.0, and why the parsed query prints as something~2.
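
To illustrate how a minimum similarity could end up as an edit distance of 2, here is a minimal sketch that does not depend on Lucene. The class FuzzyEditsSketch, the method similarityToEdits and the constant MAX_SUPPORTED_EDITS are hypothetical names made up for this example. The conversion rule itself is an assumption about what the 4.x parser does (values of 1 or more are taken as an edit count, smaller values are scaled by the term length, and everything is capped at 2), so check it against the source of the version you use.

```java
public class FuzzyEditsSketch {
    // Assumption: FuzzyQuery in Lucene 4.x supports at most 2 edits.
    static final int MAX_SUPPORTED_EDITS = 2;

    // Hypothetical helper: converts a "fuzzy min similarity" into a
    // maximum edit distance for a term of the given length.
    static int similarityToEdits(float minSim, int termLen) {
        if (minSim >= 1f) {
            // Values >= 1 are treated as an edit count, capped at the maximum.
            return (int) Math.min(minSim, MAX_SUPPORTED_EDITS);
        } else if (minSim == 0.0f) {
            // Assumption: 0 means an exact match, not unlimited edits.
            return 0;
        } else {
            // Old-style similarity in (0, 1): scale by term length, then cap.
            return Math.min((int) ((1D - minSim) * termLen), MAX_SUPPORTED_EDITS);
        }
    }

    public static void main(String[] args) {
        int len = "something".length(); // 9 characters
        System.out.println(similarityToEdits(2.0f, len)); // 2
        System.out.println(similarityToEdits(0.6f, len)); // 2 (3 before the cap)
        System.out.println(similarityToEdits(0.9f, len)); // 0
    }
}
```

Under these assumptions, both the default of 2.0 and a value such as 0.6 give 2 edits for a nine-character term, which would explain why the printed query is something~2 either way. To request a smaller distance, write it in the query string itself, for example something~1.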

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow