Lucene: new WordnetSynonymParser(boolean dedup, boolean expand, Analyzer analyzer)

StackOverflow https://stackoverflow.com/questions/18500670

  •  26-06-2022

Question

The constructor of WordnetSynonymParser accepts three parameters: boolean dedup, boolean expand, and an Analyzer.

What do dedup and expand do? I don't understand them.

The documentation says:

If dedup is true then identical rules (same input, same output) will be added only once.

What does that mean in practice? Could someone give an example? And what about the expand parameter?

Thanks for any help.


Solution

The dedup value is passed directly to the SynonymMap.Builder and does what the documentation says: if two identical synonym rules exist (same input, same output), only one of them is kept. It is generally safe to set this to true unless you have a specific reason not to.
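To make the effect of dedup concrete, here is a minimal sketch in plain Java. It does not use Lucene at all: RuleTable is a hypothetical stand-in for a synonym rule table, and it only illustrates the "same input, same output is added once" behaviour described in the documentation, not how SynonymMap.Builder is actually implemented.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a synonym rule table; NOT a Lucene class.
class RuleTable {
    private final boolean dedup;
    private final List<String> rules = new ArrayList<>();

    RuleTable(boolean dedup) {
        this.dedup = dedup;
    }

    // Adds the rule "input -> output". With dedup enabled, a rule that is
    // already present (same input, same output) is silently skipped.
    void add(String input, String output) {
        String rule = input + " -> " + output;
        if (dedup && rules.contains(rule)) {
            return;
        }
        rules.add(rule);
    }

    List<String> rules() {
        return rules;
    }
}
```

Adding "stroll -> walk" twice and "amble -> walk" once leaves two rules in a table created with dedup = true, and three rules in a table created with dedup = false.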

To understand expand, here is how it is used:

 if (expand) {
   for (int i = 0; i < size; i++) {
     for (int j = 0; j < size; j++) {
       add(synset[i], synset[j], false);
     }
   }
 } else {
   for (int i = 0; i < size; i++) {
     add(synset[i], synset[0], false);
   }
 }

So, if expand is true, it adds a rule for every ordered pair of words in the synset, so each synonym maps to every synonym in the set (itself included). If it is false, it creates rules such that each synonym is replaced only with the first synonym in the list. Say we had a set of synonymous words: "walk", "stroll" and "amble".

Expanded, this would generate the synonyms:

walk -> walk
walk -> stroll
walk -> amble
stroll -> walk
stroll -> stroll
stroll -> amble
amble -> walk
amble -> stroll
amble -> amble

Without expanding, you would just have:

walk -> walk
stroll -> walk
amble -> walk
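The two lists can be reproduced with a small plain-Java sketch that mirrors the loop quoted earlier. SynsetRules and its rules method are hypothetical names made up for this illustration; the code only imitates the loop structure and does not call Lucene.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that imitates the quoted expand logic; NOT a Lucene class.
class SynsetRules {
    // Returns the rules generated for one synset as "input -> output" strings.
    static List<String> rules(String[] synset, boolean expand) {
        List<String> out = new ArrayList<>();
        int size = synset.length;
        if (expand) {
            // Every word maps to every word in the set, itself included.
            for (int i = 0; i < size; i++) {
                for (int j = 0; j < size; j++) {
                    out.add(synset[i] + " -> " + synset[j]);
                }
            }
        } else {
            // Every word maps only to the first word in the set.
            for (int i = 0; i < size; i++) {
                out.add(synset[i] + " -> " + synset[0]);
            }
        }
        return out;
    }
}
```

Calling SynsetRules.rules(new String[] {"walk", "stroll", "amble"}, true) yields the nine expanded rules, and passing false yields the three rules that all point to "walk".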

Generally, I would be inclined to set this to false, so that synonym matches get reduced to one main synonym, but it does depend on your needs.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow