Question

I'm trying to use the NLTK named entity tagger to identify various named entities. The book Natural Language Processing with Python gives a list of commonly used named entity types (Table 7.4, if anyone is curious), which includes DATE (June, 2008-06-29) and TIME (two fifty a m, 1:30 p.m.). So I got the impression that this could be done with NLTK's named entity tagger.

However, when I run the tagger, it doesn't seem to pick up dates or times at all, the way it does people or organizations. Does the NLTK named entity tagger not handle these date/time cases, or does it only pick up a specific date/time format? If it doesn't handle these cases, does anybody know of a system that does? Or is creating my own the only solution?

Thanks!


Solution

You should check out the contrib repository of NLTK, which contains a module called timex.py. You can also download it directly here: https://github.com/nltk/nltk_contrib/blob/master/nltk_contrib/timex.py

From the first line of the module:

# Code for tagging temporal expressions in text
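If you do end up rolling your own, here is a minimal sketch of the rule-based idea. This is not the code from timex.py and does not use its API. The function name `tag_temporal` and the pattern lists are hypothetical, chosen only for illustration, and they cover just a few formats (ISO dates, capitalized month names, clock times).

```python
import re

# Hypothetical, deliberately small pattern set; a real tagger needs many more rules.
MONTH = (r"(?:January|February|March|April|May|June|July|August|"
         r"September|October|November|December)")

DATE_PATTERNS = [
    r"\b\d{4}-\d{2}-\d{2}\b",                              # ISO dates such as 2008-06-29
    r"\b" + MONTH + r"(?:\s+\d{1,2})?(?:,?\s+\d{4})?\b",   # June / June 29 / June 29, 2008
]

TIME_PATTERNS = [
    # Clock times such as 1:30, 1:30pm, 1:30 p.m. (the am/pm suffix is case-insensitive)
    r"\b\d{1,2}:\d{2}(?:\s*(?i:a\.m\.|p\.m\.|am|pm)(?![A-Za-z]))?",
]


def tag_temporal(text):
    """Return (label, matched_text) pairs in order of appearance in text."""
    found = []
    for label, patterns in (("DATE", DATE_PATTERNS), ("TIME", TIME_PATTERNS)):
        for pattern in patterns:
            for match in re.finditer(pattern, text):
                found.append((match.start(), label, match.group()))
    found.sort()  # order by position in the text
    return [(label, matched) for _, label, matched in found]


if __name__ == "__main__":
    sample = "The meeting moved from June 29, 2008 to 2008-07-01 at 1:30 p.m."
    print(tag_temporal(sample))
```

Note that a bare month name like "May" is ambiguous with the modal verb at the start of a sentence, and spelled-out times such as "two fifty a m" are not handled here at all. Those are the cases where a more complete rule set or a dedicated temporal tagger pays off.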
Licensed under: CC-BY-SA with attribution