
How to identify date from a string in Java

I have recently been struggling with what looks like an "easy" problem. Suppose there are sentences stored in a String, and I need to find out whether the String contains a date. The challenge is that the date can appear in many different formats. Some examples:

  • June 12, 1956
  • London, 21st October 2014
  • 13 October 1999
  • 01/11/2003

Note that each date is embedded in a longer string. For example:

String s = "This event took place on 13 October 1999.";

My question is how to detect that there is a date in this string. My first approach was to search for the word "event" and then try to locate the date near it. But as the number of possible date formats grows, this solution becomes unwieldy. The second approach I tried was to build a list of month names and search for them. This gave good results, but it still misses dates written entirely in digits.

One solution I have not tried yet is to write regular expressions and look for a match in the string. I am not sure how much this would hurt performance.
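For illustration, here is a minimal sketch of the regular-expression idea using only java.util.regex. The class name DateFinder and its methods are hypothetical, and the pattern covers only the four formats listed in the question, so treat it as a starting point and not a complete solution. The pattern is compiled once into a static constant, so the compilation cost is not paid again on each call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: finds date-like substrings in free text.
public class DateFinder {

    // English month names, full or abbreviated, with an optional trailing period.
    private static final String MONTH =
        "(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?|July?"
        + "|Aug(?:ust)?|Sep(?:t(?:ember)?)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)\\.?";

    // Optional ordinal suffix on the day, as in "21st".
    private static final String DAY = "\\d{1,2}(?:st|nd|rd|th)?";

    // Compiled once and reused for every call.
    private static final Pattern DATE = Pattern.compile(
        // "13 October 1999", "21st October 2014"
        "\\b" + DAY + "\\s+" + MONTH + "\\s+\\d{4}\\b"
        // "June 12, 1956"
        + "|\\b" + MONTH + "\\s+" + DAY + ",?\\s+\\d{4}\\b"
        // "01/11/2003"
        + "|\\b\\d{1,2}/\\d{1,2}/\\d{4}\\b",
        Pattern.CASE_INSENSITIVE);

    // Returns every substring that looks like a date, in order of appearance.
    public static List<String> findDates(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = DATE.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    // True if the text contains at least one date-like substring.
    public static boolean containsDate(String text) {
        return DATE.matcher(text).find();
    }

    public static void main(String[] args) {
        String s = "This event took place on 13 October 1999.";
        System.out.println(findDates(s)); // prints [13 October 1999]
    }
}
```

Note that this only checks the shape of the text: it would also accept "45/99/2003", and it cannot tell whether "01/11/2003" means 1 November or 11 January. If that matters, each match would still need to be parsed and validated in a second step.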

What would be a good solution to consider? Has anybody faced a similar problem before, and what solutions did you find?

One thing is certain: the strings contain no times, so only the date matters.

bbakiu asked Nov 05 '15 14:11



2 Answers

You can use the Natty library (natty.joestelmach.com).

Natty is a natural language date parser written in Java. Given a date expression, natty will apply standard language recognition and translation techniques to produce a list of corresponding dates with optional parse and syntax information.

import com.joestelmach.natty.*;
import java.util.Date;
import java.util.List;

// parse() returns a list of date groups; take the dates from the first group.
List<Date> dates = new Parser()
        .parse("Start date 11/30/2013 , end date Friday, Sept. 7, 2013")
        .get(0)
        .getDates();
System.out.println(dates.get(0));
System.out.println(dates.get(1));

// Output:
// Sat Nov 30 11:14:30 BDT 2013
// Sat Sep 07 11:14:30 BDT 2013

NightSkyCode answered Sep 20 '22 16:09


What you are after is Named Entity Recognition (NER). I'd start with Stanford NLP. Its 7-class model includes a date class, but the online demo struggles and misses the "13". :(

Natty, mentioned above, gives a better answer.

Michael Lloyd Lee mlk answered Sep 21 '22 16:09