I want to write code that matches certain words regardless of their form. A base word and its -ing form should match, e.g. add = adding, recruit = recruiting. The same goes for derived forms, e.g. recruit = recruitment = recruiter.

In short, all forms of a word should be treated as equal. Is there a Java library I can use to achieve this?

I am somewhat familiar with Apache OpenNLP, so could that help in any way?

Thanks!!


Solution

It sounds like you want a stemmer or a lemmatizer. A stemmer strips suffixes by rule, while a lemmatizer maps each word to its dictionary form. Stanford CoreNLP includes a lemmatizer, and you might also want to try the Porter stemmer.

My guess is that these will cover some of your cases but not all of them. For example, "recruitment" won't be lemmatized to "recruit", because a lemmatizer normalizes inflected forms (recruiting) rather than derived words. For that you'd need a more complex morphological analyzer, but I don't know of a good existing system.
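
To make the stemming idea concrete, here is a minimal sketch of rule-based suffix stripping in plain Java. It is not the Porter algorithm and not any library's API: the class `SuffixStemmer`, its methods `stem` and `sameStem`, the suffix list, and the minimum stem length are all made up for illustration, chosen only to cover the words in the question.

```java
import java.util.Locale;

/** Crude suffix stripper for illustration only; NOT the Porter algorithm. */
class SuffixStemmer {
    // Hypothetical suffix list, checked in order (longest first).
    private static final String[] SUFFIXES = {"ment", "ing", "er", "s"};

    // Assumed cutoff so that very short words are left alone.
    private static final int MIN_STEM_LENGTH = 3;

    /** Lower-cases the word and strips at most one known suffix. */
    static String stem(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        for (String suffix : SUFFIXES) {
            int stemLength = w.length() - suffix.length();
            if (w.endsWith(suffix) && stemLength >= MIN_STEM_LENGTH) {
                return w.substring(0, stemLength);
            }
        }
        return w; // no suffix matched, keep the word as it is
    }

    /** Two words match when they reduce to the same stem. */
    static boolean sameStem(String a, String b) {
        return stem(a).equals(stem(b));
    }
}
```

With this sketch, `SuffixStemmer.sameStem("recruitment", "recruiter")` and `SuffixStemmer.sameStem("add", "adding")` both hold. It breaks quickly on real text, though: "running" comes out as "runn", which is the kind of case a real stemmer or lemmatizer handles with more careful rules.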

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow