Question

I'm new to text mining and NLP. I'm trying to use named entity recognition (NER), specifically the Stanford Named Entity Tagger, to extract dates from a given text. I'm using the online demos provided by Stanford NLP (http://nlp.stanford.edu:8080/ner/process) and GATE ANNIE (http://services.gate.ac.uk/annie/).

These demos do not recognize complete phrases such as "last Sunday", "next Monday", "this month end", or "till this Sunday evening" as dates. "Sunday" or "Monday" alone is not enough to determine the date. Is there an option to extract the full expression, as in these examples?

Example text:

Treat yourself with Puma as it offers Flat 50% off. Hurry offer valid till this Sunday. Happy Shopping.

Extracted date: 25-08-2013 (Considering today is 19-08-2013. Date format can be anything)

Does any library provide this kind of date recognition, or is it possible to build a custom model that recognizes dates as in the example text?

Solution

SUTime, part of Stanford CoreNLP, can recognize and normalize temporal expressions. Its project page includes example code, and an online demo is available.

Related question: Is it good to use stanford temporal tagger for big data?

OTHER TIPS

The Tagger_DateNormalizer plugin in GATE can do this. By default it normalizes relative date expressions against today's date. You can override this with the sourceOfDocumentDate parameter, which takes the reference date from a document feature or from an annotation created by a previous step in the pipeline instead of using the current date. For example, if you're processing news articles you'll probably want to normalize against the publication date rather than the date on which your pipeline runs.

While the plugin is called a date "normalizer", it is in fact a "tagger and normalizer": it finds the date expressions in the text and annotates them with a normalized value, rather than taking a list of pre-existing Date annotations and normalizing those.
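
To make the "tag, then normalize against a reference date" idea concrete, here is a minimal sketch in plain Python. It is not how SUTime or Tagger_DateNormalizer work internally; it only handles "last/this/next" followed by a weekday name. The names `tag_dates` and `resolve` are made up for this illustration, and the reading of "next" (strictly after the reference date) and "last" (strictly before it) is an assumption, since those phrases are ambiguous in everyday English.

```python
import re
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]

# Matches phrases such as "last Sunday", "this Sunday", "next Monday".
PATTERN = re.compile(
    r"\b(last|this|next)\s+(" + "|".join(WEEKDAYS) + r")\b",
    re.IGNORECASE,
)


def resolve(modifier, weekday, reference):
    """Turn a (modifier, weekday) pair into a date relative to `reference`."""
    target = WEEKDAYS.index(weekday.lower())  # Monday == 0, as in date.weekday()
    modifier = modifier.lower()
    # Days from the reference forward to the nearest such weekday (0 if today).
    ahead = (target - reference.weekday()) % 7
    if modifier == "this":
        return reference + timedelta(days=ahead)
    if modifier == "next":
        # Assumption: "next" never resolves to the reference date itself.
        return reference + timedelta(days=ahead or 7)
    # "last": assumption, the most recent such weekday strictly before reference.
    back = (reference.weekday() - target) % 7
    return reference - timedelta(days=back or 7)


def tag_dates(text, reference):
    """Find relative weekday expressions and pair each with a normalized date."""
    return [
        (m.group(0), resolve(m.group(1), m.group(2), reference))
        for m in PATTERN.finditer(text)
    ]


text = ("Treat yourself with Puma as it offers Flat 50% off. "
        "Hurry offer valid till this Sunday. Happy Shopping.")

# The reference date is passed in explicitly (here 19-08-2013, a Monday),
# playing the role of the document date discussed above.
for phrase, resolved in tag_dates(text, date(2013, 8, 19)):
    print(phrase, "->", resolved.strftime("%d-%m-%Y"))
# prints: this Sunday -> 25-08-2013
```

Passing the reference date as an argument, rather than calling `date.today()` inside the function, mirrors the point about normalizing against a publication date instead of the date the pipeline runs. Expressions like "this month end" or "till this Sunday evening" need many more rules, which is the reason to use an existing temporal tagger.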

Licensed under: CC-BY-SA with attribution