Question

I need an algorithm that tokenizes a given sentence into words and tags each word with its grammatical role.

For example: "People took to the streets and protested" → People - noun, took - verb, and - conjunction, to - ... and so on.


Solution

You mean you want part-of-speech tagging?

>>> import nltk
>>> tokens = nltk.word_tokenize("People took to the streets and protested")
>>> nltk.pos_tag(tokens)
[('People', 'NNS'), ('took', 'VBD'), ('to', 'TO'), ('the', 'DT'), ('streets', 'NNS'), ('and', 'CC'), ('protested', 'VBD')]
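
If these calls raise a LookupError, the tokenizer and tagger data packages have to be downloaded first. Below is a minimal sketch assuming a standard NLTK install (resource names can vary slightly between NLTK versions); the tagset='universal' option maps the Penn Treebank codes (NNS, VBD, ...) to coarse labels such as NOUN and VERB, which is closer to the wording in the question.

>>> import nltk
>>> # One-time data downloads; names may differ on newer NLTK releases
>>> # (e.g. 'punkt_tab', 'averaged_perceptron_tagger_eng').
>>> nltk.download('punkt')
>>> nltk.download('averaged_perceptron_tagger')
>>> nltk.download('universal_tagset')
>>> tokens = nltk.word_tokenize("People took to the streets and protested")
>>> # tagset='universal' collapses the fine-grained Penn Treebank tags into
>>> # coarse categories such as NOUN, VERB, DET and CONJ.
>>> nltk.pos_tag(tokens, tagset='universal')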