Question

Intro to my problem: users can search for terms, and RitaWordNet provides a method called getSenseIds() to get the related senses. I am currently using WS4J (WordNet Similarity for Java, http://code.google.com/p/ws4j/), which offers several algorithms for measuring the distance between words. A search for "user" returns these senses:

  • user
  • exploiter
  • drug user

http://wordnetweb.princeton.edu/perl/webwn?s=user&sub=Search+WordNet&o2=&o0=1&o8=1&o1=1&o7=&o5=&o9=&o6=&o3=&o4=&h=0

The Lin similarity is measured by comparing two terms in WS4J (with targetWord, I assume?):

  • Similarity between: user and: user = 1.7976931348623157E308
  • Similarity between: user and: exploiter = 0.1976958835785797
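A side note on the first score: 1.7976931348623157E308 is exactly Java's Double.MAX_VALUE, so it looks like a sentinel for "identical terms" rather than a real Lin score, which normally lies between 0 and 1. A minimal sketch of how such scores could be normalised before ranking follows. The class and method names are hypothetical helpers and are not part of WS4J.

```java
// Hypothetical helper (not part of WS4J): maps raw scores into [0, 1].
class SimilarityScores {

    // Clamps a raw score so that a sentinel such as Double.MAX_VALUE
    // becomes 1.0 and any negative value becomes 0.0.
    static double clampToUnit(double rawScore) {
        return Math.max(0.0, Math.min(1.0, rawScore));
    }

    public static void main(String[] args) {
        System.out.println(clampToUnit(Double.MAX_VALUE));
        System.out.println(clampToUnit(0.1976958835785797));
    }
}
```

With this, the two scores above stay comparable on the same scale, assuming the measure in use is bounded by 1.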

I would like to suggest to the end user that the "user" sense is the most relevant/correct answer, but whether it is depends on the rest of the sentence.

Examples: "The old man was a regular user of public transport" and "The young man became a drug user while studying NLP".

I assume that the SenseRelate project includes something that I'm missing. This thread also came up during my search: word disambiguation algorithm (Lesk algorithm)
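To make the Lesk idea concrete, here is a rough sketch of a simplified Lesk variant: each sense gets a "signature" (its lemma plus its gloss), and the sense whose signature shares the most words with the sentence wins. This is only an illustration under assumptions: the glosses are typed in by hand as paraphrases rather than loaded from WordNet, the stop-word list is ad hoc, there is no stemming, and none of the names below come from RitaWordNet or WS4J.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Simplified Lesk sketch: choose the sense whose lemma + gloss
// overlaps most with the words of the sentence.
class SimplifiedLesk {

    // Ad hoc stop-word list so that function words do not count as overlap.
    private static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList(
            "a", "an", "the", "of", "or", "who", "was", "while", "and", "is"));

    // Lower-cases the text, splits on non-letters, and drops stop words.
    static Set<String> tokenize(String text) {
        Set<String> tokens = new HashSet<>();
        for (String token : text.toLowerCase().split("[^a-z]+")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Returns the lemma of the best-overlapping sense.
    // Ties go to the sense listed first in the map.
    static String disambiguate(String sentence, Map<String, String> glosses) {
        Set<String> context = tokenize(sentence);
        String best = null;
        int bestOverlap = -1;
        for (Map.Entry<String, String> sense : glosses.entrySet()) {
            // Signature = lemma words + gloss words.
            Set<String> signature = tokenize(sense.getKey() + " " + sense.getValue());
            signature.retainAll(context);
            if (signature.size() > bestOverlap) {
                bestOverlap = signature.size();
                best = sense.getKey();
            }
        }
        return best;
    }

    // ASSUMPTION: hand-written, paraphrased glosses for illustration only.
    static Map<String, String> userGlosses() {
        Map<String, String> glosses = new LinkedHashMap<>();
        glosses.put("user", "a person who makes use of a thing");
        glosses.put("exploiter", "a person who uses something or someone selfishly");
        glosses.put("drug user", "a person who takes drugs");
        return glosses;
    }

    public static void main(String[] args) {
        Map<String, String> glosses = userGlosses();
        System.out.println(disambiguate(
                "The old man was a regular user of public transport", glosses));
        System.out.println(disambiguate(
                "The young man became a drug user while studying NLP", glosses));
    }
}
```

For the two example sentences this picks "user" and "drug user" respectively. The first case is only a tie-break in favour of the first-listed sense, which shows how weak plain word overlap is on short glosses. That is presumably why tools such as SenseRelate rely on relatedness measures over the whole context instead.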

Hopefully my question makes sense :)


Solution

You might want to try WordNet::SenseRelate::AllWords - there's an online demo at http://maraca.d.umn.edu

Licensed under: CC-BY-SA with attribution