Question

I need an algorithm that tokenizes a given sentence into words and tags each word with its grammatical category.

For example: "People took to the streets and protested" would give people-noun, took-verb, to-preposition, and-conjunction, and so on.


Solution

You mean you want part-of-speech (POS) tagging? NLTK can tokenize the sentence and tag each token with a Penn Treebank tag:

>>> import nltk
>>> tokens = nltk.word_tokenize("People took to the streets and protested")
>>> nltk.pos_tag(tokens)
[('People', 'NNS'), ('took', 'VBD'), ('to', 'TO'), ('the', 'DT'), ('streets', 'NNS'), ('and', 'CC'), ('protested', 'VBD')]
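The tags above are Penn Treebank codes (NNS = plural noun, VBD = past-tense verb, CC = coordinating conjunction, and so on). If you want plain labels like "noun" and "conjunction", as in the question, one option is a small lookup on the tag prefix. The sketch below is only an illustration: `COARSE_LABELS` and `readable_tags` are hypothetical names, the mapping is a partial subset of the tagset, and the tagged list is copied from the output above so the snippet runs without NLTK installed.

```python
# Hypothetical mapping from the first two letters of a Penn Treebank tag
# to a plain-English label. Illustrative subset only, not the full tagset.
COARSE_LABELS = {
    "NN": "noun",          # NN, NNS, NNP, NNPS
    "VB": "verb",          # VB, VBD, VBG, VBN, VBP, VBZ
    "JJ": "adjective",     # JJ, JJR, JJS
    "RB": "adverb",        # RB, RBR, RBS
    "PR": "pronoun",       # PRP, PRP$
    "DT": "determiner",
    "IN": "preposition",
    "TO": "to",            # the word "to", as preposition or infinitive marker
    "CC": "conjunction",
}


def readable_tags(tagged):
    """Turn (word, treebank_tag) pairs into (word, label) pairs.

    Tags whose two-letter prefix is not in COARSE_LABELS become "other".
    """
    return [(word, COARSE_LABELS.get(tag[:2], "other")) for word, tag in tagged]


# Output of nltk.pos_tag from the session above, copied verbatim.
tagged = [('People', 'NNS'), ('took', 'VBD'), ('to', 'TO'), ('the', 'DT'),
          ('streets', 'NNS'), ('and', 'CC'), ('protested', 'VBD')]

for word, label in readable_tags(tagged):
    print(f"{word}-{label}")
```

NLTK's `pos_tag` also accepts a `tagset='universal'` argument that returns coarse tags such as NOUN and VERB directly, provided the corresponding tagset mapping data is installed. Note that `word_tokenize` and `pos_tag` both need NLTK data packages fetched once with `nltk.download(...)`; the exact package names depend on your NLTK version.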
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow