Question

I have a small program where users can type short sentences, 10-20 words long. I then want to search WordNet for user-specified terms and retrieve a list of senses that have hypernyms and hyponyms.

I want the senses that are most related to both the specified term AND the sentence to be displayed at the top of the list. The user input involves little text, so I hope the processing will be fast. I found an excellent resource, but I wonder if I could simplify the process/code involved somehow? From p.32 in the pdf: shortcut to .pdf-file

  1. Loader - Loads data from a data source and turns it into a string.
  2. Parser - Takes a string and turns it into a document object by parsing it into sentences with words.
  3. POS-tagger - Takes a document object and determines the Part-Of-Speech for each word.
  4. Sense relater - Takes a document object and finds senses for each of the words.
  5. Stemmer - Takes a document object and stems all the words.
  6. Trimmer - Takes a document object and removes words from it.
  7. Includer - Takes a document object and adds words to it.
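
For such short input, the seven steps above can be treated as a chain of small functions over one document object. The following is only a minimal sketch of that idea, not the design from the pdf: the names `PipelineSketch`, `Stage`, `parse`, `trimmer` and `run` are made up for illustration, the document is reduced to a plain word list, and only a parser and a trimmer are stubbed in. A POS-tagger, sense relater or stemmer would be further `Stage` implementations.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class PipelineSketch {

    /** One step that takes a document (here just a word list) and returns a new one. */
    interface Stage {
        List<String> apply(List<String> words);
    }

    /** Parser stand-in: lower-cases and splits on anything that is not a letter or digit. */
    static List<String> parse(String text) {
        List<String> words = new ArrayList<>();
        for (String w : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!w.isEmpty()) {
                words.add(w);
            }
        }
        return words;
    }

    /** Trimmer stand-in: removes every word found in the given stop list. */
    static Stage trimmer(Set<String> stopWords) {
        return words -> {
            List<String> kept = new ArrayList<>();
            for (String w : words) {
                if (!stopWords.contains(w)) {
                    kept.add(w);
                }
            }
            return kept;
        };
    }

    /** Parses the text, then runs the stages in order, each on the previous result. */
    static List<String> run(String text, List<Stage> stages) {
        List<String> doc = parse(text);
        for (Stage stage : stages) {
            doc = stage.apply(doc);
        }
        return doc;
    }

    public static void main(String[] args) {
        // Hypothetical stop list and sentence, only to show the wiring.
        Set<String> stop = new HashSet<>(Arrays.asList("the", "a", "on", "by"));
        List<Stage> stages = Arrays.asList(trimmer(stop));
        System.out.println(run("The bank on the river was flooded.", stages));
    }
}
```

Keeping every step behind the same interface means steps can be dropped or reordered without touching the rest, which is one way to simplify the process for 10-20 word inputs.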

I also got this resource from a professor, but it uses Perl, which I do not know, so I went back to the .pdf mentioned above. If I could call the Perl script from my Java application, I guess I could use it. While searching for a solution, I found this thread:

Is there any way of using SenseRelate in Java?

http://metacpan.org/pod/WordNet::SenseRelate::TargetWord

To sum up: I want to use the SenseRelate code to retrieve the most relevant senses first. The problem is that it is written in Perl, and I really need a Java-based API or anything else that helps me further. If anyone has any hints, they are more than appreciated! :)


Solution

Assuming you haven't found a Java solution, it would be fairly straightforward, as you suggested, to execute a Perl command from Java, passing in the appropriate arguments, and then process its response from stdout. This seems like a perfectly OK technique to use. I have never written Java before, but here it goes...

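// http://docs.oracle.com/javase/6/docs/api/java/lang/ProcessBuilder.html
// Needs: import java.io.BufferedReader; import java.io.InputStreamReader;

// Each argument is a separate string; stderr is merged into stdout.
Process p = new ProcessBuilder("/usr/bin/perl", "script.pl", "arg")
        .redirectErrorStream(true).start();
BufferedReader out = new BufferedReader(new InputStreamReader(p.getInputStream()));
for (String line; (line = out.readLine()) != null; ) {
    System.out.println(line); // script.pl stdout and stderr
}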

You could use WordNet::SenseRelate::TargetWord to perform the core disambiguation, printing what you want to return to stdout.
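
To show how the Java side of this might look in full, here is a rough sketch that runs a Perl wrapper script and collects what it prints. Everything specific is an assumption: the class name `PerlSenseClient`, the idea that your own `script.pl` takes the target word and the sentence as two arguments, and the output format (one sense per line, most relevant first). The script itself, which would call WordNet::SenseRelate::TargetWord, is not shown.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch: call a Perl wrapper script and read the ranked senses it prints. */
final class PerlSenseClient {

    /** Runs the command and returns its output lines (stderr is merged into stdout). */
    static List<String> run(List<String> command) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command).redirectErrorStream(true).start();
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
            for (String line; (line = reader.readLine()) != null; ) {
                lines.add(line);
            }
        }
        int exit = p.waitFor();
        if (exit != 0) {
            throw new IOException("command failed with exit code " + exit + ": " + lines);
        }
        return lines;
    }

    /** Keeps the non-blank lines, trimmed. ASSUMES one sense per line, best first. */
    static List<String> parseSenses(List<String> lines) {
        List<String> senses = new ArrayList<>();
        for (String line : lines) {
            String s = line.trim();
            if (!s.isEmpty()) {
                senses.add(s);
            }
        }
        return senses;
    }

    /** Hypothetical entry point: the script receives the word and the sentence as arguments. */
    static List<String> rankedSenses(String script, String word, String sentence)
            throws IOException, InterruptedException {
        return parseSenses(run(Arrays.asList("/usr/bin/perl", script, word, sentence)));
    }
}
```

A call such as `PerlSenseClient.rankedSenses("script.pl", "bank", "I sat on the bank of the river")` would then return the printed senses in the order the script emitted them. Passing the word and the sentence as separate list elements avoids any shell quoting problems with user input.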

Licensed under: CC-BY-SA with attribution