Question

I need an algorithm to tokenize a given sentence into words that are correctly tagged with their grammatical meaning.

For example: "People took to the streets and protested" → People - noun, took - verb, and - conjunction, to - ... and so on.


Answer

You mean you want part-of-speech (POS) tagging?

>>> import nltk
>>> tokens = nltk.word_tokenize("People took to the streets and protested")
>>> nltk.pos_tag(tokens)
[('People', 'NNS'), ('took', 'VBD'), ('to', 'TO'), ('the', 'DT'), ('streets', 'NNS'), ('and', 'CC'), ('protested', 'VBD')]
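The tags come from the Penn Treebank tag set (NNS = plural noun, VBD = past-tense verb, DT = determiner, CC = coordinating conjunction, and so on). Depending on your NLTK version, you may first need to download the tokenizer and tagger models; a minimal sketch, assuming the standard resource names:

>>> import nltk
>>> nltk.download('punkt')                       # tokenizer model used by word_tokenize
>>> nltk.download('averaged_perceptron_tagger')  # model used by pos_tag
>>> nltk.help.upenn_tagset('VBD')                # prints a description of a given tag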