Date Extraction from Text

I am trying to use the Stanford NLP tool to extract dates (e.g. 8/11/2012) from text.

Here is a link to the demo of this tool.

Can you help me train the classifier to identify dates such as 8/11/2012?

I tried using training data such as:

Woodhouse PERS 8/18/2012 Date , O handsome O

but it does not work, even when I test on the same data.

Swakesh asked Dec 15 '22 17:12



2 Answers

Using an NLP tool to extract dates from text seems like overkill if that is all you are trying to accomplish. You should consider simpler options first, such as a Java regular expression (e.g. here).
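For the regex route, something along these lines might be enough. This is only a minimal sketch: the class name `DateExtractor` and the pattern are my own assumptions, and it only handles numeric dates in the d/m/yyyy shape from the question, without checking that the day and month are in range.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any Stanford NLP API.
final class DateExtractor {
    // 1-2 digits, slash, 1-2 digits, slash, 4 digits (e.g. 8/11/2012).
    // The lookbehind/lookahead stop the pattern from matching inside a longer run of digits.
    private static final Pattern DATE =
            Pattern.compile("(?<!\\d)(\\d{1,2})/(\\d{1,2})/(\\d{4})(?!\\d)");

    // Returns every substring of text that looks like a numeric date, in order of appearance.
    static List<String> extractDates(String text) {
        List<String> dates = new ArrayList<>();
        Matcher m = DATE.matcher(text);
        while (m.find()) {
            dates.add(m.group());
        }
        return dates;
    }

    public static void main(String[] args) {
        // prints [8/18/2012]
        System.out.println(extractDates("Woodhouse arrived on 8/18/2012, handsome as ever."));
    }
}
```

If you need written-out dates ("August 18, 2012") or relative ones ("next Tuesday"), a pattern like this gets unwieldy quickly, and a dedicated tool starts to make more sense.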

If you are doing something that requires more features from the Stanford NLP tool, take a look at the SUTime annotator. Their demo page will let you get a feel for how it behaves. Make sure to check the option Read rules from file and you will see that your date gets annotated.

Usage:

SUTime annotations are provided automatically with the StanfordCoreNLP pipeline by including the ner annotator.
tysonjh answered Jan 03 '23 03:01



You can certainly train the CRF-based NER to recognize dates and times. You can see an example of that by running the supplied english.muc.7class.distsim.crf.ser.gz model. See the FAQ for training NER systems. But note that our primary tool for time/date recognition is now regex based: SUTime. You can also write rules for SUTime for other applications. See the SUTime page and the link to TokensRegex on that page.
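On the training-data side, a rough illustration may help. As far as I recall from the NER FAQ, the CRF trainer reads one token per line with the label in a tab-separated column, rather than tokens and labels interleaved on a single line. The sketch below turns token/label pairs into that layout. The class and method names are hypothetical, the labels are simply the ones from the question, and the exact properties the trainer needs should be checked against the FAQ.

```java
import java.util.List;

// Hypothetical helper for illustration; not part of Stanford NER.
final class NerTrainingRows {
    // Each pair is {token, label}; the result has one "token<TAB>label" row per line.
    // Assumption: the trainer is configured to read column 0 as the word and column 1 as the label.
    static String toTsv(List<String[]> tokenLabelPairs) {
        StringBuilder sb = new StringBuilder();
        for (String[] pair : tokenLabelPairs) {
            sb.append(pair[0]).append('\t').append(pair[1]).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Token/label pairs taken from the training line in the question.
        List<String[]> rows = List.of(
                new String[] {"Woodhouse", "PERS"},
                new String[] {"8/18/2012", "Date"},
                new String[] {",", "O"},
                new String[] {"handsome", "O"});
        System.out.print(toTsv(rows));
    }
}
```

Note that a single training sentence is very unlikely to be enough for the classifier to generalize, whatever the file layout.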

Christopher Manning answered Jan 03 '23 03:01
