Question

I'd like to change the similarity before searching the index. This is what I do:

QueryParser parser = new QueryParser(Version.LUCENE_43, "field", standarAnalyzer);
System.out.println("similarity before: " + parser.getFuzzyMinSim());
parser.setFuzzyMinSim(0.6f);
System.out.println("similarity after: " + parser.getFuzzyMinSim());
Query query = parser.parse(inputString); // inputString is given by the user
System.out.println("Querystring: " + query.toString());

Now, when inputString = "something~", I get this output:

similarity before: 2.0
similarity after: 0.5
Querystring: field:something~2 // Why 2!?

My questions:

  1. Why is the similarity set to 2.0 at the beginning (I thought it was 0.5 by default)?
  2. Why is it still 2.0 after calling the setFuzzyMinSim method?

Solution

FuzzyQuery changed significantly in Lucene version 4. The number after the '~' is now a maximum edit distance, not a minimum similarity. I'm not really clear on how FuzzyMinSim is mapped to a maximum edit distance when the StandardQueryParser generates a FuzzyQuery. Note that using DefaultFuzzyMinSim in 4.x is deprecated.
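
For illustration, here is a minimal sketch of how a legacy similarity value could be turned into an edit distance. It is modeled on what I believe the deprecated FuzzyQuery.floatToEdits helper does in 4.x, but it is a standalone re-implementation, not Lucene's source: the class name FuzzyEditsSketch and the constant MAX_SUPPORTED_DISTANCE are hypothetical, and the exact rounding in the library may differ.

```java
// Sketch only: an assumed model of the similarity-to-edits mapping in Lucene 4.x.
// It does not call Lucene; names here are hypothetical.
final class FuzzyEditsSketch {
    // Assumed cap: the largest edit distance FuzzyQuery accepts.
    static final int MAX_SUPPORTED_DISTANCE = 2;

    static int floatToEdits(float minimumSimilarity, int termLength) {
        if (minimumSimilarity >= 1f) {
            // Values of 1 or more are treated as an edit distance already.
            return (int) Math.min(minimumSimilarity, MAX_SUPPORTED_DISTANCE);
        } else if (minimumSimilarity == 0.0f) {
            // Zero means an exact match, not unlimited edits.
            return 0;
        } else {
            // Fractions are scaled by the term length, truncated, then capped.
            int edits = (int) ((1.0 - minimumSimilarity) * termLength);
            return Math.min(edits, MAX_SUPPORTED_DISTANCE);
        }
    }

    public static void main(String[] args) {
        int len = "something".length(); // 9 characters
        System.out.println("2.0 -> " + floatToEdits(2.0f, len));
        System.out.println("0.5 -> " + floatToEdits(0.5f, len));
        System.out.println("0.6 -> " + floatToEdits(0.6f, len));
    }
}
```

Under these assumptions, 2.0, 0.5 and 0.6 all come out as 2 edits for a nine-character term such as "something", which would be consistent with the parsed query printing as field:something~2.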

An edit distance of 2 is the default maximum. Edit distances greater than 2 are not supported by the FuzzyQuery class, and so are not supported by the standard query parser either.
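
To make "an edit distance of at most 2" concrete, the sketch below computes a plain Levenshtein distance (insertions, deletions, substitutions) between two terms. This is only an illustration of the concept: the class EditDistanceSketch and its methods are hypothetical, Lucene does not compare terms this way internally, and as far as I know FuzzyQuery by default also counts a transposition of adjacent characters as a single edit, which this version does not.

```java
// Sketch only: classic two-row dynamic-programming Levenshtein distance.
final class EditDistanceSketch {
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // True when b is reachable from a within maxEdits single-character edits.
    static boolean withinMaxEdits(String a, String b, int maxEdits) {
        return levenshtein(a, b) <= maxEdits;
    }

    public static void main(String[] args) {
        System.out.println(withinMaxEdits("something", "somethin", 2));
        System.out.println(withinMaxEdits("something", "sumthing", 2));
        System.out.println(withinMaxEdits("something", "nothing", 2));
    }
}
```

With a limit of 2, "somethin" (1 edit) and "sumthing" (2 edits) are within range of "something", while "nothing" (3 edits) is not.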

Licensed under: CC-BY-SA with attribution