
Java English date format parsing

Tags:

java

My problem is the following. I am reading a file that contains a bunch of CSV lines. Each line contains a date such as 22-mar-2010, i.e. in the format dd-MMM-yyyy. I want to convert it to ISO format, so that it becomes 2010-03-22.

The code I have looks like this:

  private String convertDate(String date) {
    DateTimeFormatter oldFormat = DateTimeFormatter.ofPattern("dd-MMM-yyyy", new Locale("en"));
    LocalDate parsedDate = LocalDate.parse(date, oldFormat);

    DateTimeFormatter newFormat = DateTimeFormatter.ISO_DATE;
    String newDate = parsedDate.format(newFormat);
    return newDate;
  }

The input looks something like this:

sdfdsfslk 28-mar-2007 dfdsljs
sdfdsfslk 20-apr-2014 dfdsljs
sdfdsfslk 13-oct-2005 dfdsljs
sdfdsfslk 20-may-2014 dfdsljs
sdfdsfslk 20-jan-2014 dfdsljs
sdfdsfslk 20-feb-2014 dfdsljs

If I include the locale as above, or use withLocale(Locale.ENGLISH), then it fails on the date string in the first row. The exception is:

java.time.format.DateTimeParseException: Text '28-mar-2007' could not be parsed at index 3

If I remove the locale part and just have:

DateTimeFormatter.ofPattern("dd-MMM-yyyy");

Then it works until it encounters a date such as 13-oct-2005. It does not like the English 'oct' and fails at the LocalDate.parse line. If I change oct to okt (Swedish, which is where I am), then it parses.

Do I need to change my Locale completely or what is going wrong here? How can I get it to parse dates with months in English even though I am in Sweden?

Souciance Eqdam Rashti asked Aug 29 '16 16:08


2 Answers

I think the problem is that the first letter of the month is lowercase. When you run the same code for 28-Mar-2007 instead of 28-mar-2007 everything works fine.
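To check that letter case really is the culprit, here is a minimal sketch that reuses the pattern and English locale from the question. The class name CaseSensitivityDemo and the helper parses are made up for illustration. Going by the behaviour described above, the capitalized input should parse and the lowercase one should be rejected.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

// Hypothetical demo class, not part of the question's code.
class CaseSensitivityDemo {
    // Same pattern and locale as in the question; parsing is case-sensitive by default.
    static final DateTimeFormatter STRICT_CASE =
            DateTimeFormatter.ofPattern("dd-MMM-yyyy", Locale.ENGLISH);

    // Returns true if the text parses with the case-sensitive formatter.
    static boolean parses(String text) {
        try {
            LocalDate.parse(text, STRICT_CASE);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("28-Mar-2007 parses: " + parses("28-Mar-2007"));
        System.out.println("28-mar-2007 parses: " + parses("28-mar-2007"));
    }
}
```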

One quick and dirty solution is:

private String convertDate(String mydate) {
  // For "28-mar-2007" the first four characters are "28-m", so upper-casing
  // them capitalizes the first letter of the month ("28-M").
  // Assumes a two-digit day; a one-digit day would shift the month's position.
  String head = mydate.substring(0, 4).toUpperCase(Locale.ROOT);
  String tail = mydate.substring(4).toLowerCase(Locale.ROOT);
  String date = head + tail;

  DateTimeFormatter oldFormat = DateTimeFormatter.ofPattern("dd-MMM-yyyy", Locale.ENGLISH);
  LocalDate parsedDate = LocalDate.parse(date, oldFormat);

  DateTimeFormatter newFormat = DateTimeFormatter.ISO_DATE;
  return parsedDate.format(newFormat);
}

PKey answered Oct 06 '22 13:10


tl;dr

LocalDate.parse ( 
    "13-oct-2005" , 
    new DateTimeFormatterBuilder()
        .parseCaseInsensitive()
        .appendPattern( "dd-MMM-uuuu" )
        .toFormatter( Locale.US ) 
)

Details

The Answer by Plirkee is correct: the English locales expect the abbreviated month name to start with an uppercase letter.

DateTimeFormatterBuilder

Given this faulty input data, an easier workaround is to build a formatter that is case-insensitive. The DateTimeFormatterBuilder class enables you to build more finely customized formatters than you can with a mere formatting pattern string.

The java.time classes including DateTimeFormatter and DateTimeFormatterBuilder are thread-safe. So you can keep an instance around for repeated use.
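As a rough sketch of keeping one instance around, the question's convertDate method could share a single case-insensitive formatter held in a constant. The class name DateConverter and the constant INPUT_FORMAT are made up for illustration, and Locale.ENGLISH is assumed because the month names in the input are English.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.util.Locale;

// Hypothetical wrapper class around the question's convertDate method.
class DateConverter {
    // Built once and reused for every call; DateTimeFormatter is immutable and thread-safe.
    private static final DateTimeFormatter INPUT_FORMAT = new DateTimeFormatterBuilder()
            .parseCaseInsensitive()          // accept "mar", "Mar" and "MAR" alike
            .appendPattern("dd-MMM-uuuu")
            .toFormatter(Locale.ENGLISH);

    // Converts e.g. "28-mar-2007" to the ISO 8601 form "2007-03-28".
    static String convertDate(String date) {
        return LocalDate.parse(date, INPUT_FORMAT).format(DateTimeFormatter.ISO_LOCAL_DATE);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"28-mar-2007", "13-oct-2005", "20-MAY-2014"}) {
            System.out.println(s + " -> " + convertDate(s));
        }
    }
}
```

Because the locale is fixed in the formatter, the result no longer depends on the JVM's default locale, so the same code should behave identically on a Swedish and an English machine.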

Builder pattern

Read up on the Builder design pattern if you are not familiar with it. Rather than calling a constructor with a multitude of arguments, you construct a Builder object and chain calls to whichever of its methods suit your needs. At the end, you ask that Builder to instantiate the object you really want, a DateTimeFormatter in this case.

.parseCaseInsensitive()

The trick we need is the call to .parseCaseInsensitive(). You can verify that this call is the crucial ingredient by swapping with the commented-out line that omits this call.

//  DateTimeFormatterBuilder fbuilder = new DateTimeFormatterBuilder ().appendPattern ( "dd-MMM-uuuu" );  // Case-sensitive by default.
DateTimeFormatterBuilder fbuilder = new DateTimeFormatterBuilder ().parseCaseInsensitive ().appendPattern ( "dd-MMM-uuuu" );  // Case-insensitive to handle improper English.

String input = "13-oct-2005"; // Incorrect English. Should be uppercase 'Oct'.
DateTimeFormatter f = fbuilder.toFormatter ( Locale.US );
LocalDate ld = LocalDate.parse ( input , f );

ld.toString() → 2005-10-13

ISO 8601

Tip: When exchanging date-time values as text, always use standard ISO 8601 formats rather than devise your own funky formats such as that seen in the Question. The java.time classes use these standard formats by default when parsing/generating strings.
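To illustrate that default, here is a small sketch in which an ISO 8601 date string is parsed and regenerated without specifying any formatter at all. The class name IsoDefaultDemo is made up for illustration.

```java
import java.time.LocalDate;

// Hypothetical demo class showing the ISO 8601 defaults of LocalDate.
class IsoDefaultDemo {
    // LocalDate.parse(CharSequence) expects ISO 8601 text such as "2010-03-22".
    static LocalDate parseIso(String text) {
        return LocalDate.parse(text);
    }

    public static void main(String[] args) {
        LocalDate date = parseIso("2010-03-22");
        // toString() also generates ISO 8601 text, so the value round-trips unchanged.
        System.out.println(date);
        System.out.println(date.getMonth());
    }
}
```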

Basil Bourque answered Oct 06 '22 12:10