
Why does Java's SimpleDateFormat parse this?

Hi, I've got a SimpleDateFormat set up with a custom format string: MMddyy

and I give it the following value to parse: 4 1 01

I don't think it should parse this because of the spaces, but the SimpleDateFormat is returning the date

April 4th 0001AD

Any ideas why?

Craig Warren asked Apr 06 '11 10:04


2 Answers

This is expected behaviour: you are telling the DateFormat object to expect a 6-character String representation of a date, and that is what you passed in. Spaces are parsed OK. However, if you used "4x1x01" you would get an error. Note that when parsing, leniency defaults to true, e.g.:

DateFormat df = new SimpleDateFormat("MMddyy");
Date date = df.parse("4 1 01"); // runs successfully (as you know)

DateFormat df = new SimpleDateFormat("MMddyy");
Date date = df.parse("41 01"); // 5 character String - runs successfully

DateFormat df = new SimpleDateFormat("MMddyy");
df.setLenient(false);
Date date = df.parse("41 01"); // 5 character String - causes exception

DateFormat df = new SimpleDateFormat("MMddyy");
Date date = df.parse("999999"); // 6 character String - runs successfully

DateFormat df = new SimpleDateFormat("MMddyy");
df.setLenient(false);
Date date = df.parse("999999"); // 6 character String - causes exception

When leniency is set to true (the default behaviour), the parser makes an effort to decipher invalid input, e.g. the 35th day of a 31-day month becomes the 4th day of the next month.
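The rollover described above can be reproduced with a small sketch. The class name `LeniencyDemo` and the helper `parse` are made up for illustration; the pattern "MM/dd/yyyy" and the input "01/35/2011" are assumptions chosen so that no whitespace is involved, and UTC is used only to keep the output independent of the local time zone.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

public class LeniencyDemo {

    private static final TimeZone UTC = TimeZone.getTimeZone("UTC");

    // Hypothetical helper: parses text with the given pattern and returns the
    // result as yyyy-MM-dd, or null if SimpleDateFormat rejects the input.
    public static String parse(String pattern, String text, boolean lenient) {
        SimpleDateFormat parser = new SimpleDateFormat(pattern, Locale.US);
        parser.setLenient(lenient);
        parser.setTimeZone(UTC);
        SimpleDateFormat iso = new SimpleDateFormat("yyyy-MM-dd", Locale.US);
        iso.setTimeZone(UTC);
        try {
            return iso.format(parser.parse(text));
        } catch (ParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Lenient (the default): January 35th rolls over into February.
        System.out.println(parse("MM/dd/yyyy", "01/35/2011", true));  // 2011-02-04
        // Strict: the same input is rejected.
        System.out.println(parse("MM/dd/yyyy", "01/35/2011", false)); // null
    }
}
```

If out-of-range values should be rejected rather than rolled over, calling `setLenient(false)` before parsing is the usual switch.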

CodeClimber answered Sep 25 '22 02:09


When parsing, the length of a pattern field (the number of repeated letters) is not the expected length of the corresponding text. From the javadoc, for the relevant presentation types:

  • Number: For parsing, the number of pattern letters is ignored unless it's needed to separate two adjacent fields.
  • Year: During parsing, only strings consisting of exactly two digits […] will be parsed into the default century. Any other numeric string, such as a one digit string, a three or more digit string, or a two digit string that isn't all digits (for example, "-1"), is interpreted literally. So "01/02/3" or "01/02/003" are parsed, using the same pattern, as Jan 2, 3 AD.
  • Month: If the number of pattern letters is 3 or more, the month is interpreted as text; otherwise, it is interpreted as a number.

The whitespace causes the parser to stop parsing the current field (trailing spaces are not valid for numbers) and start with the next one. Since the pattern does not have a space between these two fields, the space is not consumed and becomes part of the second field (leading spaces are valid). So the text parsed as the year is not "exactly two digits" and will not be placed in the default century.
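The Number and Year rules quoted above can be checked without any whitespace in the input. This is only a sketch: `FieldWidthDemo` and `parseStrict` are made-up names, the inputs are assumptions chosen for illustration, and the two-digit results depend on the default century, which SimpleDateFormat places within 80 years before and 20 years after the moment the instance is created.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

public class FieldWidthDemo {

    private static final TimeZone UTC = TimeZone.getTimeZone("UTC");

    // Hypothetical helper: strictly parses text with the given pattern and
    // returns the result as yyyy-MM-dd, or null if the input is rejected.
    public static String parseStrict(String pattern, String text) {
        SimpleDateFormat parser = new SimpleDateFormat(pattern, Locale.US);
        parser.setLenient(false);
        parser.setTimeZone(UTC);
        SimpleDateFormat iso = new SimpleDateFormat("yyyy-MM-dd", Locale.US);
        iso.setTimeZone(UTC);
        try {
            return iso.format(parser.parse(text));
        } catch (ParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Number rule: a single M or d still accepts all digits up to the separator.
        System.out.println(parseStrict("M/d/yyyy", "12/25/2011")); // 2011-12-25
        // Adjacent fields: the letter count is what splits "040611" into 04, 06, 11.
        System.out.println(parseStrict("MMddyy", "040611"));       // 2011-04-06
        // Year rule: exactly two digits are placed in the default century ...
        System.out.println(parseStrict("d/y", "3/04"));            // 2004-01-03
        // ... while a one digit year is taken literally.
        System.out.println(parseStrict("d/y", "3/4"));             // 0004-01-03
    }
}
```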

Parsing tests (lenient set to false):

FORMAT   TEXT     RESULT (ISO yyyy-MM-dd)
-------------------------------------------------
dddyy    01011    2011-01-10  
dddyy    10 11    0011-01-10  (year is 3 chars: " 11")
dddyy    10 1     0001-01-10  (year is 2 chars but not 2 digits: " 1")

dddy     01011    2011-01-10  ("y" same as "yy")

dd yy    10 11    2011-01-10  (ok, whitespace is consumed, year: "11")

d/y      3/4      0004-01-03  (year is not 2 digits)
d/y      3/04     2004-01-03  

M/d/y    4/6/11   2011-04-06
user85421 answered Sep 27 '22 02:09