I need to parse dates of the format "January 10th, 2010" in Java. How can I do this?
How do I handle the ordinal indicators, the st, nd, rd, or th trailing the day number?
This works:
String s = "January 10th, 2010";
DateFormat dateFormat = new SimpleDateFormat("MMM dd yyyy");
System.out.println("" + dateFormat.parse(s.replaceAll("(?:st|nd|rd|th),", "")));
but you need to make sure you are using the right Locale to properly parse the month name.
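To make the Locale point concrete, here is a minimal sketch that passes Locale.ENGLISH explicitly, so the month name parses regardless of the JVM default locale. The class name LocaleAwareParse and the helper parse are made up for this illustration and are not part of the answer above.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.Locale;

class LocaleAwareParse {

    // Hypothetical helper: strips the ordinal suffix plus comma, then parses
    // with an explicit English locale instead of the JVM default.
    static Date parse(String text) throws ParseException {
        String cleaned = text.replaceAll("(?:st|nd|rd|th),", "");
        SimpleDateFormat format = new SimpleDateFormat("MMMM d yyyy", Locale.ENGLISH);
        return format.parse(cleaned);
    }

    public static void main(String[] args) throws ParseException {
        Calendar cal = Calendar.getInstance();
        cal.setTime(parse("January 10th, 2010"));
        // Print the individual fields rather than the Date itself, whose
        // toString() depends on the default time zone.
        System.out.println(cal.get(Calendar.YEAR) + "-"
                + (cal.get(Calendar.MONTH) + 1) + "-"
                + cal.get(Calendar.DAY_OF_MONTH)); // prints 2010-1-10
    }
}
```

Without the explicit Locale, the same code may fail with a ParseException on a machine whose default language is not English.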
I know you can include literal text inside the SimpleDateFormat pattern. However, in this case the text depends on the day number and is not actually relevant to the parsing process.
This is the simplest solution I can think of, but I would love to be proven wrong.
You can avoid the pitfalls exposed in one of the comments by doing something similar to this:
String s = "January 10th, 2010";
DateFormat dateFormat = new SimpleDateFormat("MMM dd yyyy");
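// The lookbehind uses the bounded \\d{1,2} because older Java versions reject an unbounded \\d+ there.
System.out.println("" + dateFormat.parse(s.replaceAll("(?<= \\d{1,2})(?:st|nd|rd|th),(?= \\d+$)", "")));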
This way, malformed input such as Jath,uary 10 2010 is not matched.
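A small sketch comparing the two regular expressions on well-formed and malformed input may help. It assumes a bounded lookbehind (\d{1,2}), which is a choice of this sketch because some Java versions reject an open-ended quantifier inside a lookbehind. The class name OrdinalStripDemo and the helper strip are hypothetical.

```java
import java.util.regex.Pattern;

class OrdinalStripDemo {

    // Simple pattern: any st/nd/rd/th that is directly followed by a comma.
    static final Pattern SIMPLE = Pattern.compile("(?:st|nd|rd|th),");

    // Anchored pattern: the suffix must follow a space and one or two digits,
    // and must be followed by a space and digits at the end of the input.
    static final Pattern ANCHORED =
            Pattern.compile("(?<= \\d{1,2})(?:st|nd|rd|th),(?= \\d+$)");

    // Hypothetical helper: removes every match of the given pattern.
    static String strip(Pattern pattern, String text) {
        return pattern.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        String good = "January 10th, 2010";
        String bad = "Jath,uary 10 2010";

        System.out.println(strip(SIMPLE, good));   // January 10 2010
        System.out.println(strip(ANCHORED, good)); // January 10 2010
        System.out.println(strip(SIMPLE, bad));    // Jauary 10 2010 (silently altered)
        System.out.println(strip(ANCHORED, bad));  // Jath,uary 10 2010 (left untouched)
    }
}
```

With the anchored pattern the malformed string reaches the parser unchanged, so the parser can reject it instead of receiving a silently altered string.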
I should like to contribute the modern answer. Rather than the SimpleDateFormat class used in the two top-voted answers today, you should use java.time, the modern Java date and time API. It offers a couple of nice solutions.
We first define a formatter for parsing:
private static final DateTimeFormatter PARSING_FORMATTER = DateTimeFormatter.ofPattern(
"MMMM d['st']['nd']['rd']['th'], uuuu", Locale.ENGLISH);
Then we use it like this:
String dateString = "January 10th, 2010";
LocalDate date = LocalDate.parse(dateString, PARSING_FORMATTER);
System.out.println("Parsed date: " + date);
Output is:
Parsed date: 2010-01-10
The square brackets [] in the format pattern string enclose optional parts, and the single quotes enclose literal text. So d['st']['nd']['rd']['th'] means that there may be st, nd, rd and/or th after the day of month.
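To illustrate how the optional sections behave, the following sketch runs the same pattern over each suffix and over a date with no suffix at all. The class name OptionalSuffixDemo and the helper parse are invented for the example.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

class OptionalSuffixDemo {

    // Same pattern as above: each bracketed section may or may not be present.
    static final DateTimeFormatter PARSER = DateTimeFormatter.ofPattern(
            "MMMM d['st']['nd']['rd']['th'], uuuu", Locale.ENGLISH);

    // Hypothetical helper wrapping LocalDate.parse with the formatter above.
    static LocalDate parse(String text) {
        return LocalDate.parse(text, PARSER);
    }

    public static void main(String[] args) {
        String[] inputs = {
            "January 1st, 2010",
            "January 2nd, 2010",
            "January 3rd, 2010",
            "January 10th, 2010",
            "January 10, 2010"   // no ordinal indicator at all
        };
        for (String input : inputs) {
            System.out.println(input + " -> " + parse(input));
        }
    }
}
```

Because every section is optional, the last input parses too; all five strings yield a LocalDate.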
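A couple of limitations with the approach above are that it accepts a wrong ordinal indicator such as 10st and even 10stndrdth, and that while the formatter works for parsing, you cannot use it for formatting (it would produce January 10stndrdth, 2010).
If you want better validation of the ordinal indicator or you want the possibility of formatting the date back into a string, you may build your formatter in this way: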
private static final DateTimeFormatter FORMATTING_AND_PARSING_FORMATTER;
static {
    Map<Long, String> ordinalNumbers = new HashMap<>(42);
    ordinalNumbers.put(1L, "1st");
    ordinalNumbers.put(2L, "2nd");
    ordinalNumbers.put(3L, "3rd");
    ordinalNumbers.put(21L, "21st");
    ordinalNumbers.put(22L, "22nd");
    ordinalNumbers.put(23L, "23rd");
    ordinalNumbers.put(31L, "31st");
    for (long d = 1; d <= 31; d++) {
        ordinalNumbers.putIfAbsent(d, "" + d + "th");
    }
    FORMATTING_AND_PARSING_FORMATTER = new DateTimeFormatterBuilder()
            .appendPattern("MMMM ")
            .appendText(ChronoField.DAY_OF_MONTH, ordinalNumbers)
            .appendPattern(", uuuu")
            .toFormatter(Locale.ENGLISH);
}
This will parse the date string the same as the one above. Let’s also try it for formatting:
System.out.println("Formatted back using the same formatter: "
+ date.format(FORMATTING_AND_PARSING_FORMATTER));
Output is:
Formatted back using the same formatter: January 10th, 2010
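As a rough check of both properties, the sketch below rebuilds an equivalent formatter so that it is self-contained, formats a few days, and tests whether a wrong indicator is rejected. The class name OrdinalFormatterDemo and the helpers build, suffix and accepts are assumptions of this sketch; the suffix rule is just a compact way of filling the same map as above.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.time.temporal.ChronoField;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class OrdinalFormatterDemo {

    static final DateTimeFormatter FORMATTER = build();

    // Builds a formatter whose day-of-month text is "1st", "2nd", ..., "31st".
    static DateTimeFormatter build() {
        Map<Long, String> ordinals = new HashMap<>();
        for (long day = 1; day <= 31; day++) {
            ordinals.put(day, day + suffix(day));
        }
        return new DateTimeFormatterBuilder()
                .appendPattern("MMMM ")
                .appendText(ChronoField.DAY_OF_MONTH, ordinals)
                .appendPattern(", uuuu")
                .toFormatter(Locale.ENGLISH);
    }

    // English ordinal suffix for a day of month (11, 12 and 13 take "th").
    static String suffix(long day) {
        if (day >= 11 && day <= 13) {
            return "th";
        }
        switch ((int) (day % 10)) {
            case 1: return "st";
            case 2: return "nd";
            case 3: return "rd";
            default: return "th";
        }
    }

    // Returns true if the text parses with the formatter, false otherwise.
    static boolean accepts(String text) {
        try {
            LocalDate.parse(text, FORMATTER);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(LocalDate.of(2010, 1, 1).format(FORMATTER));  // January 1st, 2010
        System.out.println(LocalDate.of(2010, 1, 22).format(FORMATTER)); // January 22nd, 2010
        System.out.println(accepts("January 10th, 2010")); // true
        System.out.println(accepts("January 10st, 2010")); // false
    }
}
```

Since the day text must match one of the map entries exactly, 10st is not accepted, unlike with the optional-section pattern.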
You can set nd etc. as literals in a SimpleDateFormat. You can define the four needed formats and try them in turn, starting with th because I guess it will occur most often. If one fails with a ParseException, try the next one. If all fail, throw the ParseException. The code here is just a concept. In real life you would not create the formats anew every time, and you might have to think about thread safety.
public static Date hoolaHoop(final String dateText) throws ParseException
{
    ParseException pe = null;
    // Try the most common suffix first
    String[] sss = {"th", "nd", "rd", "st"};
    for (String special : sss)
    {
        SimpleDateFormat sdf = new SimpleDateFormat("MMMM d'" + special + ",' yyyy");
        try
        {
            return sdf.parse(dateText);
        }
        catch (ParseException e)
        {
            // remember for throwing later
            pe = e;
        }
    }
    throw pe;
}

public static void main(String[] args) throws java.lang.Exception
{
    // The empty string at the end is there to show the failure case
    String[] dateText = {"January 10th, 2010", "January 1st, 2010", "January 2nd, 2010", ""};
    for (String dt : dateText) { System.out.println(hoolaHoop(dt)); }
}
Output:
Sun Jan 10 00:00:00 GMT 2010
Fri Jan 01 00:00:00 GMT 2010
Sat Jan 02 00:00:00 GMT 2010
Exception in thread "main" java.text.ParseException: Unparseable date: ""
The list "th","nd","rd","st" is of course only suitable for English-language Locales. Keep that in mind. In French it would be "re","nd" etc., I guess.
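One possible way to address the two real-life concerns (creating the formats only once, and thread safety) is to keep one set of SimpleDateFormat instances per thread. This is only a sketch under that assumption; the class name CachedOrdinalParser and its members are hypothetical, and Locale.ENGLISH is passed explicitly so the month names do not depend on the default locale.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.Locale;

class CachedOrdinalParser {

    private static final String[] SUFFIXES = {"th", "nd", "rd", "st"};

    // SimpleDateFormat is not thread-safe, so each thread lazily builds and
    // then reuses its own list of formats.
    private static final ThreadLocal<List<SimpleDateFormat>> FORMATS =
            ThreadLocal.withInitial(() -> {
                List<SimpleDateFormat> list = new ArrayList<>();
                for (String suffix : SUFFIXES) {
                    list.add(new SimpleDateFormat(
                            "MMMM d'" + suffix + ",' yyyy", Locale.ENGLISH));
                }
                return list;
            });

    // Tries each cached format in turn; rethrows the last failure if none fits.
    static Date parse(String text) throws ParseException {
        ParseException last = null;
        for (SimpleDateFormat format : FORMATS.get()) {
            try {
                return format.parse(text);
            } catch (ParseException e) {
                last = e;
            }
        }
        throw last;
    }

    public static void main(String[] args) throws ParseException {
        SimpleDateFormat iso = new SimpleDateFormat("yyyy-MM-dd");
        System.out.println(iso.format(parse("January 10th, 2010"))); // 2010-01-10
        System.out.println(iso.format(parse("January 1st, 2010")));  // 2010-01-01
    }
}
```

The per-thread cache avoids both repeated construction and shared mutable state, at the cost of one small list of formats per thread.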