Question

I want to write code that matches certain words regardless of their form. A noun can become a verb by adding -ing, e.g. add = adding, recruit = recruiting. Derived forms should match too, e.g. recruit = recruitment = recruiter.

In simple terms, all forms of a word should be treated as equal. Is there a Java library or program I can use to achieve this?

I am somewhat familiar with Apache OpenNLP, so could that help in any way?

Thanks!!

Solution

It sounds like you want a stemmer or a lemmatizer. Stanford CoreNLP includes a lemmatizer, so you might want to check that out. You could also try the Porter stemmer.

My guess is that these will cover some of the cases but not all of them. For example, "recruitment" won't be lemmatized to "recruit", because lemmatizers undo inflection (recruiting, recruits) rather than derivation (recruitment, recruiter). For that, you'd need a more complex morphological analyzer, but I don't know of a good existing system.
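
To make the trade-off concrete, here is a minimal sketch of the suffix-stripping idea in plain Java. It is not the Porter algorithm and it does not use CoreNLP or OpenNLP; the class `SuffixMatcher`, its suffix list, and the minimum stem length are all assumptions made up for illustration. It shows how the forms from the question could be conflated, and also why such a naive approach misses some pairs and wrongly merges others.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not part of any NLP library.
final class SuffixMatcher {
    // Assumed suffix list, checked in this order (longest first).
    private static final String[] SUFFIXES = {"ment", "ing", "er", "s"};
    // Assumed guard so that short words such as "sing" or "bus" are left alone.
    private static final int MIN_STEM_LENGTH = 3;

    // Strips at most one known suffix from the lower-cased word.
    static String stem(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        for (String suffix : SUFFIXES) {
            int stemLength = w.length() - suffix.length();
            if (w.endsWith(suffix) && stemLength >= MIN_STEM_LENGTH) {
                return w.substring(0, stemLength);
            }
        }
        return w;
    }

    // Two words match when they reduce to the same stem.
    static boolean sameStem(String a, String b) {
        return stem(a).equals(stem(b));
    }

    public static void main(String[] args) {
        System.out.println(sameStem("add", "adding"));          // true
        System.out.println(sameStem("recruit", "recruitment")); // true
        System.out.println(sameStem("recruit", "recruiter"));   // true
        System.out.println(sameStem("run", "running"));         // false: "running" becomes "runn"
        System.out.println(sameStem("corn", "corner"));         // true: a false match
    }
}
```

The last two lines show the typical failure modes: a doubled consonant is not handled, and unrelated words can collapse to the same stem. A real stemmer or lemmatizer deals with many of these cases, so treat this only as a way to see what "matching by stem" means.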

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow