I believe this can be done with JWI as well, but it's not very intuitive.
Let's start with a lemmatized word. If you have a word that is not lemmatized, you should use a lemmatizer before searching for the word using JWI.
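import edu.mit.jwi.*;
import edu.mit.jwi.data.ILoadPolicy;
import edu.mit.jwi.item.*;

String lemma = ... // the lemmatized word
// WN_DIR is presumably a java.io.File (or URL) pointing at the WordNet "dict" directory
IRAMDictionary dict = new RAMDictionary(WN_DIR, ILoadPolicy.IMMEDIATE_LOAD);
dict.open(); // the dictionary must be opened before any lookup; throws IOException
IIndexWord indexWord = dict.getIndexWord(lemma, POS.NOUN); // or POS.VERB, etc.
if (indexWord != null) { // null if WordNet has no entry for this lemma and POS
    for (IWordID id : indexWord.getWordIDs()) {
        IWord word = dict.getWord(id);
        int count = dict.getSenseEntry(word.getSenseKey()).getTagCount();
        System.out.println("Synset: " + word.getSynset().getGloss());
        System.out.println("Frequency: " + count);
    }
}
dict.close();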
This may look overly complicated, but note that we started with a word for this little code snippet, not a synset!
In JWI, each IWord represents a single word sense and therefore belongs to exactly one synset (although a synset will typically have more than one word in it), so the approach to computing the frequency of each word sense is quite counter-intuitive (at least it was to me).
The count is given by the getTagCount() method, for which the documentation states:
Returns the tag count for the sense entry. A tag count is a non-negative integer that represents the number of times the sense is tagged in various semantic concordance texts. A count of 0 indicates that the sense has not been semantically tagged.
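As far as I know, the sense entries JWI returns here correspond to lines in WordNet's index.sense file, where each line has four whitespace-separated fields: sense_key, synset_offset, sense_number and tag_cnt. The following is only a rough sketch of that layout, independent of JWI; the class name SenseIndexLine and the sample line are made up for illustration and are not taken from a real WordNet release.

```java
// Hypothetical helper (not part of JWI) that splits one index.sense-style line.
final class SenseIndexLine {
    final String senseKey;
    final int synsetOffset;
    final int senseNumber;
    final int tagCount;

    private SenseIndexLine(String senseKey, int synsetOffset, int senseNumber, int tagCount) {
        this.senseKey = senseKey;
        this.synsetOffset = synsetOffset;
        this.senseNumber = senseNumber;
        this.tagCount = tagCount;
    }

    // Expects: sense_key synset_offset sense_number tag_cnt
    static SenseIndexLine parse(String line) {
        String[] fields = line.trim().split("\\s+");
        if (fields.length != 4) {
            throw new IllegalArgumentException("expected 4 fields: " + line);
        }
        return new SenseIndexLine(
                fields[0],
                Integer.parseInt(fields[1]),
                Integer.parseInt(fields[2]),
                Integer.parseInt(fields[3]));
    }

    public static void main(String[] args) {
        // Made-up line, for illustration only.
        SenseIndexLine entry = parse("example%1:00:00:: 01234567 1 42");
        System.out.println(entry.senseKey + " has tag count " + entry.tagCount);
    }
}
```

The last field is the value that getTagCount() exposes, which is why the lookup goes through the sense key rather than through the synset.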
Keep in mind, though, that the sense counts in WordNet are horribly outdated (as far as I can recall, they have not been updated since 2001).
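If you want relative frequencies rather than raw counts, you can normalize the tag counts collected in the loop above. This is just a sketch using plain JDK classes; the class name SenseFrequencies and the sample counts are hypothetical, and in practice you would fill the map with the values returned by getTagCount(). Since many senses have a count of 0, the code guards against a total of zero.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of JWI): turns raw tag counts into shares of the total.
final class SenseFrequencies {

    // Keys identify senses (e.g. sense key strings); values are raw tag counts.
    static Map<String, Double> relative(Map<String, Integer> tagCounts) {
        long total = 0;
        for (int count : tagCounts.values()) {
            total += count;
        }
        Map<String, Double> shares = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : tagCounts.entrySet()) {
            // If no sense was ever tagged, report 0.0 for every sense instead of dividing by zero.
            double share = (total == 0) ? 0.0 : (double) e.getValue() / total;
            shares.put(e.getKey(), share);
        }
        return shares;
    }

    public static void main(String[] args) {
        // Made-up counts, for illustration only.
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("sense-1", 3);
        counts.put("sense-2", 1);
        counts.put("sense-3", 0);
        System.out.println(relative(counts));
    }
}
```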