
Named Entity Extraction of dates

I am completely new to NER, information extraction, and programming in general. I am trying to find a way to extract the due date and the start date from certain documents. Is there a way to do this, or a good place to start? I have looked around, but I keep running into the same problem: I can extract dates, but I cannot tell whether a given date is the due date or the posting date. And if a document contains only one date, is it the posting date or the due date? Any help would be appreciated.

Examples:

"Essay on Medieval Asia was due on September 3rd."

"Your last assignment that was given on April 6th was supposed to be submitted in 10 days."

"The bid is due no later than a month from the date it was posted (today)."

Sagar Saxena asked Sep 13 '17 08:09


1 Answer

The number of ways to express a date in free text is huge. There are a few possible solutions:

  • You can come up with a set of regular expressions and parse the dates yourself.

  • Another option is to train a supervised sequence classifier such as a CRF, provided you have documents in which the dates are annotated.

  • A third option, which can give quick results, is to use Duckling, a framework from Facebook: https://github.com/facebookincubator/duckling. It identifies date and time expressions and even normalises each of them into a single, unambiguous date.

  • Yet another option is ctparse, a pure Python package based on the ideas behind Duckling that parses time expressions from natural language in German and English.
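As a rough illustration of the first option, here is a minimal sketch that finds dates written as a month name plus a day and labels each one as "due" or "start" according to the nearest cue word before it. The cue lists, the 40-character window, and the function name `label_dates` are assumptions made for this example, not part of any library; real documents will need more patterns and more cues.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")
# Matches e.g. "September 3rd" or "April 6" (month name + day only).
DATE_RE = re.compile(rf"\b({MONTHS})\s+(\d{{1,2}})(?:st|nd|rd|th)?\b")

# Hypothetical cue words; extend these for your own documents.
CUES = {"due": "due", "deadline": "due", "submitted": "due",
        "given": "start", "posted": "start", "assigned": "start"}
CUE_RE = re.compile(r"\b(" + "|".join(CUES) + r")\b", re.IGNORECASE)


def label_dates(sentence, window=40):
    """Return (date_text, label) pairs; label is 'due', 'start' or 'unknown'."""
    results = []
    for m in DATE_RE.finditer(sentence):
        # Only look at the text shortly before the date.
        context = sentence[max(0, m.start() - window):m.start()]
        cues = CUE_RE.findall(context)
        # The cue closest to the date (the last one found) decides the label.
        label = CUES[cues[-1].lower()] if cues else "unknown"
        results.append((m.group(0), label))
    return results


print(label_dates("Essay on Medieval Asia was due on September 3rd."))
print(label_dates("Your last assignment that was given on April 6th "
                  "was supposed to be submitted in 10 days."))
```

Relative expressions such as "in 10 days" or "a month from the date it was posted" are not covered by this sketch; that is where a tool like Duckling or ctparse is likely to help more than hand-written patterns.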

David Batista answered Sep 28 '22 07:09