
Parsing a date with short month without dot

I have a String that represents a date in the French locale: 09-oct-08

I need to parse that String, so I came up with this SimpleDateFormat pattern:

String format2 = "dd-MMM-yy";

But I have a problem with the month part, which seems to be expected with an ending dot:

df2.format(new Date());

gives me :

 28-oct.-09

What is the best way to make SimpleDateFormat understand "09-oct-08"?

Full Code :

String format2 = "dd-MMM-yy"; 
DateFormat df2 = new SimpleDateFormat(format2,Locale.FRENCH); 
date = df2.parse("09-oct-08"); 

This gives me : java.text.ParseException: Unparseable date: "09-oct-08"

And if I then try to log :

df2.format(new Date()); 

I get : 28-oct.-09

Jalil asked Oct 28 '09 07:10


4 Answers

This seems to work:

    DateFormatSymbols dfsFr = new DateFormatSymbols(Locale.FRENCH);
    String[] oldMonths = dfsFr.getShortMonths();
    String[] newMonths = new String[oldMonths.length];
    for (int i = 0, len = oldMonths.length; i < len; ++i) {
        String oldMonth = oldMonths[i];

        if (oldMonth.endsWith(".")) {
            newMonths[i] = oldMonth.substring(0, oldMonths[i].length() - 1);
        } else {
            newMonths[i] = oldMonth;
        }
    }
    dfsFr.setShortMonths(newMonths);
    DateFormat dfFr = new SimpleDateFormat(
        "dd-MMM-yy", dfsFr);

    // English date parser for creating some test data.
    DateFormat dfEn = new SimpleDateFormat(
        "dd-MMM-yy", Locale.ENGLISH);
    System.out.println(dfFr.format(dfEn.parse("10-Oct-09")));
    System.out.println(dfFr.format(dfEn.parse("10-May-09")));
    System.out.println(dfFr.format(dfEn.parse("10-Feb-09")));
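The snippet above only formats dates. A minimal sketch of the same idea applied to parsing the question's input might look like the following. The class name `DotlessFrenchDateParser` and its two helper methods are hypothetical names introduced here for illustration; the approach assumes the only difference between the input and the locale's abbreviations is the missing trailing dot.

```java
import java.text.DateFormat;
import java.text.DateFormatSymbols;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

public class DotlessFrenchDateParser {

    // Hypothetical helper: a French SimpleDateFormat whose short month
    // names have any trailing dot removed ("oct." becomes "oct").
    public static DateFormat dotlessFrenchFormat(String pattern) {
        DateFormatSymbols symbols = new DateFormatSymbols(Locale.FRENCH);
        String[] months = symbols.getShortMonths(); // returns a copy, safe to modify
        for (int i = 0; i < months.length; i++) {
            if (months[i].endsWith(".")) {
                months[i] = months[i].substring(0, months[i].length() - 1);
            }
        }
        symbols.setShortMonths(months);
        return new SimpleDateFormat(pattern, symbols);
    }

    // Hypothetical helper: parses inputs such as "09-oct-08" and wraps the
    // checked ParseException in an unchecked exception.
    public static Date parse(String input) {
        try {
            return dotlessFrenchFormat("dd-MMM-yy").parse(input);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable date: " + input, e);
        }
    }

    public static void main(String[] args) {
        Date date = parse("09-oct-08");
        // Print in an unambiguous ISO-like form to check the result.
        System.out.println(new SimpleDateFormat("yyyy-MM-dd").format(date));
    }
}
```

Note that a formatter built this way no longer accepts the dotted form ("09-oct.-08") cleanly, so it suits a source that consistently omits the dot.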

Edit: Looks like St. Shadow beat me to it.

Jack Leow answered Sep 19 '22 15:09


java.time

Let’s see if the java.time framework can help.

About java.time

The java.time framework, built into Java 8 and later, supplants the troublesome old java.util.Date and java.util.Calendar classes. The new classes are inspired by the highly successful Joda-Time framework and intended as its successor: similar in concept but re-architected. They are defined by JSR 310 and extended by the ThreeTen-Extra project. See the Tutorial.

LocalDate

Unlike the old classes, java.time offers the LocalDate class to represent a date-only value, with no time-of-day nor time zone.

French Abbreviations

Take a look at what the formatters in java.time expect for abbreviated month names in en Français.

We can loop over the Month enum to get a list of months. This enum offers the getDisplayName method for generating a localized name of the month. This code demonstrates that the method produces the same output as the java.time formatter.

DateTimeFormatter formatter = DateTimeFormatter.ofPattern ( "dd-MMM-yyyy" ).withLocale ( Locale.FRENCH );
for ( Month month : Month.values () ) {
    LocalDate localDate = LocalDate.of ( 2015 , month.getValue () , 1 );
    String output = formatter.format ( localDate );
    String displayName = month.getDisplayName ( TextStyle.SHORT , Locale.FRENCH );
    System.out.println ( "output: " + output + " | displayName: " + displayName );
    // System.out.println ( "input: " + input + " → " + localDate + " → " + output );
}

output: 01-janv.-2015 | displayName: janv.
output: 01-févr.-2015 | displayName: févr.
output: 01-mars-2015 | displayName: mars
output: 01-avr.-2015 | displayName: avr.
output: 01-mai-2015 | displayName: mai
output: 01-juin-2015 | displayName: juin
output: 01-juil.-2015 | displayName: juil.
output: 01-août-2015 | displayName: août
output: 01-sept.-2015 | displayName: sept.
output: 01-oct.-2015 | displayName: oct.
output: 01-nov.-2015 | displayName: nov.
output: 01-déc.-2015 | displayName: déc.

We find a mix of 3 and 4 letter spellings. Longer names are abbreviated to three or four letters plus a period (FULL STOP). Four months have names short enough to be used without abbreviation: mars, mai, juin, août.

So, as discussed in the other Answers, no simple solution.

Fix the Data Source

My first suggestion is to fix your data source. That source apparently fails to follow proper French rules for abbreviation. Yale agrees with Java 8’s understanding of French. By the way, if you are fixing your data source, I strongly suggest using four-digit years, as two-digit years lead to no end of confusion and ambiguity.

Fix the Input

Of course the source may be out of your control or influence. In that case, as with the other Answers, you may need to do a brute-force replace rather than attempt any cleverness. On the other hand, if the only problem with your input is the missing period (FULL STOP), then you could soft-code the fix using the Month enum rather than hard-code the improper values.

I would make an initial parse attempt. Trap for the DateTimeParseException, before attempting a fix. If the exception is thrown, then fix the input.

To fix the input, try each month of the year by looping the possible set of enum instances. For each month, get its abbreviated name. Strip the period (FULL STOP) from that abbreviation, to match what we suspect is our improper incoming value. Test to see if that indeed is a match for the input. If not, go to next month.

When we do get a match, fix the input to be properly abbreviated for the Locale’s rules (French rules in our case). Then parse the fixed input. This would be our second parse attempt, as we made an initial attempt up top. If this second attempt fails, something is very wrong, as noted in the FIXME comment in the code. But normally this second parse attempt will succeed, and we can bail out of the for loop over the Month enum.

Finally, you could verify success by testing if the result is still the bogus flag value set initially (LocalDate.MIN).

String input = "09-oct-08"; // Last two digits are Year.
DateTimeFormatter formatter = DateTimeFormatter.ofPattern ( "dd-MMM-yy" ).withLocale ( Locale.FRENCH );
LocalDate localDate = LocalDate.MIN; // Some folks prefer a bogus default value as a success/failure flag rather than using a NULL.
try {
    localDate = LocalDate.parse ( input , formatter );
} catch ( DateTimeParseException e ) {
    // Look for any month name abbreviation improperly missing the period (FULL STOP).
    for ( Month month : Month.values () ) {
        String abbreviation = month.getDisplayName ( TextStyle.SHORT , Locale.FRENCH );
        String abbreviationWithoutFullStop = abbreviation.replace ( "." , "" ); // Get short abbreviation, but drop any period (FULL STOP).
        String proper = "-" + abbreviation + "-";
        String improper = "-" + abbreviationWithoutFullStop + "-";
        if ( input.contains ( improper ) ) {
            String inputFixed = input.replace ( improper , proper );
            try {
                localDate = LocalDate.parse ( inputFixed , formatter );
            } catch ( DateTimeParseException e2 ) {
                // FIXME: Handle this error. We expected this second parse attempt to succeed.
            }
            break; // Bail-out of the loop as we got a hit, matching input with a particular improper value.
        }
    }
}
boolean success = ! ( localDate.equals ( LocalDate.MIN ) );
String formatted = formatter.format ( localDate );
String outputImproper = formatted.replace ( "." , "" );  // Drop any period (FULL STOP).

Dump to console.

System.out.println ( "success: " + success + ". input: " + input + " → localDate: " + localDate + " → formatted: " + formatted + " → outputImproper: " + outputImproper ); 

success: true. input: 09-oct-08 → localDate: 2008-10-09 → formatted: 09-oct.-08 → outputImproper: 09-oct-08
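As an alternative to repairing the input string, java.time also lets you supply your own month text through `DateTimeFormatterBuilder.appendText(TemporalField, Map<Long, String>)`. The following is only a sketch under the same assumption as above, namely that the input differs from the French abbreviations solely by the missing period. The class name `DotlessMonthFormatter` and its methods are hypothetical names chosen for illustration.

```java
import java.time.LocalDate;
import java.time.Month;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.TextStyle;
import java.time.temporal.ChronoField;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class DotlessMonthFormatter {

    // Hypothetical helper: builds a "dd-MMM-yy"-like formatter whose month
    // text is the French short name with any period (FULL STOP) removed.
    public static DateTimeFormatter dotlessFrench() {
        Map<Long, String> monthText = new HashMap<>();
        for (Month month : Month.values()) {
            String abbreviation = month.getDisplayName(TextStyle.SHORT, Locale.FRENCH);
            monthText.put((long) month.getValue(), abbreviation.replace(".", ""));
        }
        return new DateTimeFormatterBuilder()
                .appendPattern("dd-")
                .appendText(ChronoField.MONTH_OF_YEAR, monthText) // custom month text
                .appendPattern("-yy")
                .toFormatter(Locale.FRENCH);
    }

    // Hypothetical convenience wrapper around LocalDate.parse.
    public static LocalDate parse(String input) {
        return LocalDate.parse(input, dotlessFrench());
    }

    public static void main(String[] args) {
        LocalDate date = parse("09-oct-08");
        System.out.println(date);                         // ISO 8601 form of the parsed date
        System.out.println(dotlessFrench().format(date)); // same formatter also writes the dotless form
    }
}
```

The same formatter handles both parsing and generating the dotless form, so no string replacement is needed in either direction. It will, however, reject properly abbreviated input such as 09-oct.-08.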

Basil Bourque answered Sep 18 '22 15:09


You can simply remove the ".":

df2.format(new Date()).replaceAll("\\.", "");

Edit, regarding the lemon answer:

It seems to be a problem with the formatting when using the Locale French. Thus, I suggest that you simply use the . removal as I explained.

Indeed, the following code:

    String format2 = "dd-MMM-yy";
    Date date = Calendar.getInstance().getTime();
    SimpleDateFormat sdf = new SimpleDateFormat(format2, Locale.FRENCH);
    System.out.println(sdf.format(date));
    sdf = new SimpleDateFormat(format2, Locale.ENGLISH);
    System.out.println(sdf.format(date));

displays the following output:

28-oct.-09
28-Oct-09

Edit again

OK, I understand your problem now.

I don't really know how you can solve this problem without processing your String first. The idea is to replace the month in the original String with the abbreviation that SimpleDateFormat expects:

        // Month spellings as they appear in the input (no trailing dot).
        String[] givenMonths = { "jan", "fév", "mars", "avr", "mai", "juin", "juil", "août", "sept", "oct", "nov", "déc" };
        // Month abbreviations as expected by the French locale.
        String[] realMonths = { "janv.", "févr.", "mars", "avr.", "mai", "juin", "juil.", "août", "sept.", "oct.", "nov.", "déc." };
        String original = "09-oct-08";
        for (int i = 0; i < givenMonths.length; i++) {
            // Literal replacement; no regular expression is needed here.
            original = original.replace(givenMonths[i], realMonths[i]);
        }
        String format2 = "dd-MMM-yy";
        DateFormat df2 = new SimpleDateFormat(format2, Locale.FRENCH);
        Date date = df2.parse(original);
        System.out.println("--> " + date);

I agree, this is awful, but I don't see any other solution if you stick to the SimpleDateFormat and Date classes.

Another solution is to use a real date and time library instead of the original JDK ones, such as Joda Time.

Romain Linsolas answered Sep 18 '22 15:09


String format2 = "dd-MMM-yy";
Date date = Calendar.getInstance().getTime();
SimpleDateFormat sdf = new SimpleDateFormat(format2);
System.out.println(sdf.format(date));

Outputs 28-Oct-09

I don't see any dots sir. Have you tried re-checking your prints? Maybe you accidentally placed a . beside your MMM?


You're getting java.text.ParseException: Unparseable date: "09-oct-08" because "09-oct-08" does not match the formatting of Locale.FRENCH. Either use the default locale (US, I think) or add a . beside your oct.
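A quick way to check this is to try both spellings against a Locale.FRENCH formatter. This is only a sketch; the class name `FrenchLocaleParseCheck` and the method `parses` are hypothetical names used for illustration.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;

public class FrenchLocaleParseCheck {

    // Hypothetical helper: reports whether the input parses with the
    // question's pattern under Locale.FRENCH.
    public static boolean parses(String input) {
        DateFormat format = new SimpleDateFormat("dd-MMM-yy", Locale.FRENCH);
        try {
            format.parse(input);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // With the dot, matching what format() produces for this locale.
        System.out.println("09-oct.-08 parses: " + parses("09-oct.-08"));
        // Without the dot, the form reported as unparseable in the question.
        System.out.println("09-oct-08 parses: " + parses("09-oct-08"));
    }
}
```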

lemon answered Sep 19 '22 15:09