Question

The constructor of WordnetSynonymParser accepts three parameters:

boolean dedup, boolean expand and an Analyzer.

But what do dedup and expand do? I don't understand.

The documentation cites:

If dedup is true then identical rules (same input, same output) will be added only once.

What does that mean? Could you give an example? And what about the expand parameter?

Any help would be appreciated, thanks.


Solution

The dedup value is passed directly to the SynonymMap.Builder, and does what the documentation says: if two identical synonym rules exist (same input, same output), only one of them is kept. It's probably pretty safe to set this to true unless you have a reason not to.
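To make the effect of dedup concrete, here is a minimal sketch in plain Java. It does not use Lucene at all: the class DedupSketch and its add and rules methods are hypothetical stand-ins invented for illustration, and the list-based storage is an assumption, not how SynonymMap.Builder stores rules internally.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a synonym rule store; NOT Lucene's implementation.
class DedupSketch {
    private final boolean dedup;
    private final List<String> rules = new ArrayList<>();

    DedupSketch(boolean dedup) {
        this.dedup = dedup;
    }

    // Records the rule "input -> output"; with dedup on, an identical rule is skipped.
    public void add(String input, String output) {
        String rule = input + " -> " + output;
        if (dedup && rules.contains(rule)) {
            return; // same input and same output already recorded
        }
        rules.add(rule);
    }

    public List<String> rules() {
        return rules;
    }

    public static void main(String[] args) {
        DedupSketch withDedup = new DedupSketch(true);
        DedupSketch withoutDedup = new DedupSketch(false);
        for (DedupSketch store : new DedupSketch[] {withDedup, withoutDedup}) {
            store.add("stroll", "walk");
            store.add("stroll", "walk"); // identical rule added a second time
            store.add("amble", "walk");
        }
        System.out.println(withDedup.rules());    // [stroll -> walk, amble -> walk]
        System.out.println(withoutDedup.rules()); // [stroll -> walk, stroll -> walk, amble -> walk]
    }
}
```

With dedup set to true the repeated rule is stored once; with false it is stored twice.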

To understand expand, here's how it is used:

 if (expand) {
   for (int i = 0; i < size; i++) {
     for (int j = 0; j < size; j++) {
       add(synset[i], synset[j], false);
     }
   }
 } else {
   for (int i = 0; i < size; i++) {
     add(synset[i], synset[0], false);
   }
 }

So, if expand is true, it adds a rule for every ordered pair of words in the synonym set, including each word mapped to itself. If it is false, it creates rules such that each word is replaced only with the first word in the list. Say we had a set of synonymous words: "walk", "stroll" and "amble".

Expanded, this would generate the synonyms:

walk -> walk
walk -> stroll
walk -> amble
stroll -> walk
stroll -> stroll
stroll -> amble
amble -> walk
amble -> stroll
amble -> amble

Without expanding, you would just have:

walk -> walk
stroll -> walk
amble -> walk

Generally, I would be inclined to set this to false, so that synonym matches get reduced to one main synonym, but it does depend on your needs.
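The two rule lists above can be reproduced with a small standalone sketch that mirrors the quoted loop. This is an illustration only: ExpandSketch and buildRules are hypothetical names, the rules are kept as plain strings, and no Lucene classes are involved.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper mirroring the expand / non-expand loops quoted above.
class ExpandSketch {

    // Returns the rules as "input -> output" strings for one synonym set.
    public static List<String> buildRules(String[] synset, boolean expand) {
        List<String> rules = new ArrayList<>();
        int size = synset.length;
        if (expand) {
            // every word maps to every word in the set, itself included
            for (int i = 0; i < size; i++) {
                for (int j = 0; j < size; j++) {
                    rules.add(synset[i] + " -> " + synset[j]);
                }
            }
        } else {
            // every word maps only to the first word in the set
            for (int i = 0; i < size; i++) {
                rules.add(synset[i] + " -> " + synset[0]);
            }
        }
        return rules;
    }

    public static void main(String[] args) {
        String[] synset = {"walk", "stroll", "amble"};
        System.out.println("expand = true:");
        buildRules(synset, true).forEach(System.out::println);  // 9 rules
        System.out.println("expand = false:");
        buildRules(synset, false).forEach(System.out::println); // 3 rules
    }
}
```

A set of n words yields n * n rules when expanded and n rules otherwise, so the expanded map grows quickly for large synonym sets.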

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow