I want to use Joda-Time to parse date-time strings from emails. Unfortunately, I get all kinds of different formats, for example:
Wed, 19 Jan 2011 12:52:31 -0600
Wed, 19 Jan 2011 10:15:34 -0800 (PST)
Wed, 19 Jan 2011 20:03:48 +0000 (UTC)
Wed, 19 Jan 2011 17:02:08 -0600 (CST)
Fri, 21 Jan 2011 10:39:55 +0100 (CET)
Fri, 21 Jan 2011 17:50:42 -0500 (EST)
Wed, 06 Apr 2011 15:38:25 GMT
Thu, 7 Apr 2011 11:38:24 +0200
Fri, 8 Apr 2011 05:13:36 -0700 (MST)
20 Apr 2011 03:00:46 -0400
The code below catches most of the variants, but not all (for example, when there are two spaces instead of one, or when the comma is missing). It also looks awkward.
Is there a more elegant way to handle this? Please advise.
DateTimeParser[] parsers = {
DateTimeFormat.forPattern("E, d MMM y HH:mm:ss Z").getParser(),
DateTimeFormat.forPattern("E, d MMM y HH:mm:ss Z '(CET)'").getParser(),
DateTimeFormat.forPattern("E, d MMM y HH:mm:ss Z '(CST)'").getParser(),
DateTimeFormat.forPattern("E, d MMM y HH:mm:ss Z '(CEST)'").getParser(),
DateTimeFormat.forPattern("E, d MMM y HH:mm:ss Z '(GMT)'").getParser(),
DateTimeFormat.forPattern("E, d MMM y HH:mm:ss Z '(MST)'").getParser(),
DateTimeFormat.forPattern("E, d MMM y HH:mm:ss Z '(PST)'").getParser(),
DateTimeFormat.forPattern("E, d MMM y HH:mm:ss Z '(UTC)'").getParser(),
DateTimeFormat.forPattern("E, d MMM y HH:mm:ss Z '(EST)'").getParser(),
DateTimeFormat.forPattern("E, d MMM y HH:mm:ss Z '(EDT)'").getParser(),
DateTimeFormat.forPattern("E, d MMM y HH:mm:ss Z '(CDT)'").getParser(),
};
DateTimeFormatter inputFormatter = new DateTimeFormatterBuilder().append(null, parsers).toFormatter();
try {
calendar = inputFormatter.withLocale(Locale.US).parseDateTime(date[0]);
}
catch(Exception e) {
System.out.println("problem with " + date[0]);
}
Short of implementing Joda's DateTimeParser yourself and essentially parsing the text by hand to build up a valid DateTime (which I think would be a lot of work), I don't think there's much wrong with your approach. I do think you have too many formats, though. Your set of formats could be reduced to:
DateTimeParser[] parsers = {
DateTimeFormat.forPattern("E, d MMM y HH:mm:ss Z").getParser(),
DateTimeFormat.forPattern("E, d MMM y HH:mm:ss Z '('z')'").getParser(),
DateTimeFormat.forPattern("E, d MMM y HH:mm:ss z").getParser(),
DateTimeFormat.forPattern("dd MMM y HH:mm:ss Z").getParser(),
};
Z (capital Z) is the RFC 822 numeric time zone offset, and z (lowercase) is the time zone abbreviation, such as PDT. This still means, on average, about two failed pattern attempts per parse request, but if this doesn't need to be high-performance, that's probably not so bad.
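For the variants that the pattern list cannot cover (double spaces, a missing comma), one option is to normalize the string before handing it to Joda. The following is only a rough sketch: MailDateNormalizer and its normalize method are hypothetical names, not part of Joda-Time, and it assumes the headers are day-first as in the samples in the question. It drops the trailing parenthesized comment, collapses whitespace, strips the optional day of week and rewrites GMT/UT/UTC as +0000.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Joda-Time: reduces an email Date header
// to the shape "d MMM y HH:mm:ss Z" so that a single pattern can parse it.
final class MailDateNormalizer {
    // Trailing comment such as "(PST)" or "(UTC)".
    private static final Pattern TRAILING_COMMENT =
            Pattern.compile("\\s*\\([^)]*\\)\\s*$");
    // Optional day-of-week prefix, with or without the comma.
    private static final Pattern DAY_OF_WEEK =
            Pattern.compile("^(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun),?\\s+",
                    Pattern.CASE_INSENSITIVE);
    // Zone written as GMT, UTC or UT instead of a numeric offset.
    private static final Pattern UTC_NAME =
            Pattern.compile("\\s(?:GMT|UTC|UT)$");

    private MailDateNormalizer() {}

    static String normalize(String raw) {
        String s = raw.trim();
        s = TRAILING_COMMENT.matcher(s).replaceFirst(""); // drop "(CST)" etc.
        s = s.replaceAll("\\s+", " ");                    // collapse runs of whitespace
        s = DAY_OF_WEEK.matcher(s).replaceFirst("");      // drop "Wed, " or "Wed "
        s = UTC_NAME.matcher(s).replaceFirst(" +0000");   // "GMT" becomes "+0000"
        return s;
    }
}
```

With the input normalized this way, a single formatter such as DateTimeFormat.forPattern("d MMM y HH:mm:ss Z").withLocale(Locale.US) should be enough for the samples above, for example parseDateTime(MailDateNormalizer.normalize(date[0])). Any header that does not follow the day-first layout would still need its own pattern.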