
Extracting dates from text in Java

Tags: java, string

Is it possible to extract dates from a string in Java?

I have 500+ strings with different data. They may contain text such as:
"... period from 08.23.2011 - 09.05.2011..."
and also:
"...period ends 06.09.2011...".

It's not certain that the above strings are present, but they can be.

Is it possible to extract these three dates and get them as Date objects?

Asked by user649542 on Feb 24 '23 at 20:02


1 Answer

In essence, regular expressions are the answer for recognition, but there are many ways to express dates and time periods, so for a good solution you probably want an existing, well-tuned set of regular expressions. There is then a second phase, interpretation, which needs more flexibility than JodaTime will parse out of the box. So for a robust solution, you probably want one of the systems built in the natural language processing community, such as SUTime, HeidelTime or GUTime.
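For the narrow case in the question, where every date looks like 08.23.2011, the two phases can be sketched with the standard library alone. This is a minimal sketch, not a general solution: the class name DateExtractor and the method extractDates are made up for illustration, it uses java.time (Java 8+) rather than JodaTime, and it assumes the dates are written month.day.year, which is what "08.23.2011" suggests but which "06.09.2011" does not confirm.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named here for illustration only.
public class DateExtractor {

    // Phase 1 (recognition): two digits, dot, two digits, dot, four digits.
    private static final Pattern DATE_PATTERN =
            Pattern.compile("\\b\\d{2}\\.\\d{2}\\.\\d{4}\\b");

    // Phase 2 (interpretation): ASSUMPTION that the order is month.day.year.
    // STRICT resolution rejects impossible dates such as 13.45.2011.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("MM.dd.uuuu")
                    .withResolverStyle(ResolverStyle.STRICT);

    // Returns every date found in the text, in order of appearance.
    public static List<LocalDate> extractDates(String text) {
        List<LocalDate> dates = new ArrayList<>();
        Matcher matcher = DATE_PATTERN.matcher(text);
        while (matcher.find()) {
            try {
                dates.add(LocalDate.parse(matcher.group(), FORMAT));
            } catch (DateTimeParseException e) {
                // The text looked like a date but is not a valid one; skip it.
            }
        }
        return dates;
    }

    public static void main(String[] args) {
        // Prints [2011-08-23, 2011-09-05]
        System.out.println(extractDates("... period from 08.23.2011 - 09.05.2011..."));
        // Prints [2011-06-09]
        System.out.println(extractDates("...period ends 06.09.2011..."));
    }
}
```

A string with no matching text simply yields an empty list, which covers the case where the dates are absent. If a legacy java.util.Date is required, a LocalDate can be converted with Date.from(localDate.atStartOfDay(ZoneId.systemDefault()).toInstant()). Anything beyond this single numeric format (month names, relative expressions, ranges) is where the dedicated systems mentioned above become the better choice.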

Answered by Christopher Manning on Feb 27 '23 at 16:02