Question

I want to extract terminological units from a corpus of specialized documents. Is there an algorithm or an out-of-the-box solution for this? Can NLTK do this?

It seems this thread addresses my question: "Extracting terms with contextual relevance (noun phrases) from text in a .NET project".

Solution

The description of what you want isn't very clear. To get better help, you should probably also post an example.

It sounds like what you're looking for is called Named Entity Recognition (NER). Depending on exactly what you want and on your data, there are existing systems that work very well, but the problem is definitely not solved. If this is what you want, the important systems to look at are GATE, Apache OpenNLP, and even NLTK.
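
If noun phrases (as in the thread linked in the question) are close enough to what you mean by "terminological units", here is a minimal sketch of pulling candidate terms with NLTK's `RegexpParser`. This is noun-phrase chunking, not full NER, and several things here are my assumptions rather than anything from the question: the sample sentence, its hand-written Penn Treebank tags, the chunk pattern, and the helper name `candidate_terms`. The tokens are tagged by hand only so the snippet runs without downloading any NLTK model data.

```python
import nltk

# Hypothetical sample sentence, tagged by hand with Penn Treebank tags so the
# sketch runs without NLTK model downloads. On real text these pairs would
# come from nltk.word_tokenize and nltk.pos_tag, which both need model data.
TAGGED = [
    ("The", "DT"), ("patient", "NN"), ("developed", "VBD"),
    ("acute", "JJ"), ("renal", "JJ"), ("failure", "NN"),
    ("after", "IN"), ("cardiac", "JJ"), ("surgery", "NN"), (".", "."),
]

# Assumed chunk pattern: any number of adjectives followed by one or more nouns.
GRAMMAR = "NP: {<JJ>*<NN.*>+}"


def candidate_terms(tagged_tokens, grammar=GRAMMAR):
    """Return noun-phrase chunks as lowercase strings, in order of appearance."""
    tree = nltk.RegexpParser(grammar).parse(tagged_tokens)
    return [
        " ".join(word.lower() for word, _tag in subtree.leaves())
        for subtree in tree.subtrees(filter=lambda t: t.label() == "NP")
    ]


if __name__ == "__main__":
    print(candidate_terms(TAGGED))
```

On a real corpus you would probably count how often each candidate occurs across documents and keep the frequent ones, since a single pattern like this also picks up ordinary nouns that are not domain terms.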

License: CC-BY-SA with attribution
Not affiliated with StackOverflow