
Parsing DateTime in a "Myy" format

I need to parse DateTime in "Myy" format, so:

  • the first number is a month without leading zero (1 to 12), and
  • the second number is a year with two digits.

Examples:

115 -> January 2015
1016 -> October 2016

When using DateTime.ParseExact with "Myy" as the format, a FormatException is thrown when the month has no leading zero.

This code throws an exception:

var date = DateTime.ParseExact("115", 
   "Myy", 
   CultureInfo.InvariantCulture); // throws FormatException

While this works fine:

var date = DateTime.ParseExact("1016", 
    "Myy", 
    CultureInfo.InvariantCulture); // works fine

MSDN Documentation clearly defines format specifiers:

  • "M" – The month, from 1 through 12.
  • "MM" – The month, from 01 through 12.
  • "yy" – The year, from 00 to 99.

Is there any format which would resolve the above case, i.e. "Myy" date time format in which month is without leading zeros?

EDIT

Just to be precise: the question is specifically about a format string for ParseExact, not about how to parse the value with string manipulation.

Dariusz Woźniak asked May 21 '15 20:05


1 Answer

This is because the DateTime parser reads from left to right without backtracking.

Since it tries to read a month first, it takes the first two digits and parses them as the month. It then tries to parse the year, but only one digit is left, so it fails. There is simply no way to solve this without introducing a separator character:

DateTime.ParseExact("1 15", "M yy", CultureInfo.InvariantCulture)

If you can’t do that, read from the right first and split off the year separately (using string manipulation). Or just add a zero to the beginning and parse it as MMyy:

string s = "115";
if (s.Length < 4)
    s = "0" + s;
Console.WriteLine(DateTime.ParseExact(s, "MMyy", CultureInfo.InvariantCulture));
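The other option mentioned above, splitting the year off from the right, could be sketched roughly as follows. MyyParser and TryParseMyy are hypothetical names made up for this illustration and are not part of the framework. The sketch assumes the input is one or two month digits followed by exactly two year digits, and it reuses the "M yy" format with a separator.

```csharp
using System;
using System.Globalization;

static class MyyParser
{
    // Hypothetical helper: parses "Myy" input such as "115" or "1016".
    public static bool TryParseMyy(string s, out DateTime result)
    {
        result = default(DateTime);

        // Assumption: 1 or 2 month digits followed by exactly 2 year digits.
        if (s == null || s.Length < 3 || s.Length > 4)
            return false;

        // Split from the right: the last two characters are always the year.
        string month = s.Substring(0, s.Length - 2);
        string year = s.Substring(s.Length - 2);

        // With a separator in place, "M" can no longer swallow a year digit.
        return DateTime.TryParseExact(month + " " + year, "M yy",
            CultureInfo.InvariantCulture, DateTimeStyles.None, out result);
    }
}
```

With this sketch, MyyParser.TryParseMyy("115", out date) should succeed and give January 1, 2015, and "1016" should give October 1, 2016. Using TryParseExact rather than ParseExact is just a design choice here so that malformed input returns false instead of throwing.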

Research!

Since ispiro asked for sources: The parsing is done by the DateTimeParse type. Relevant for us is the ParseDigits method:

internal static bool ParseDigits(ref __DTString str, int digitLen, out int result) {
    if (digitLen == 1) {
        // 1 really means 1 or 2 for this call
        return ParseDigits(ref str, 1, 2, out result);
    }
    else {
        return ParseDigits(ref str, digitLen, digitLen, out result);
    }
}

Note the comment in the case where digitLen equals 1. In the other ParseDigits overload, the first number is minDigitLen and the second is maxDigitLen. So for a passed digitLen of 1, the function will also accept a maximum length of 2 (which is what allows a single M to match two-digit months).

Now, the other overload that actually does the work contains this loop:

while (tokenLength < maxDigitLen) {
    if (!str.GetNextDigit()) {
        str.Index--;
        break;
    }
    result = result * 10 + str.GetDigit();
    tokenLength++;
}

As you can see, the method keeps taking digits from the string until it has reached the maximum digit length or hits a character that is not a digit. The rest of the method is just error checking.
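To see what this greedy behaviour means for "115", here is a small simplified model of the loop. It is not the framework's code: GreedyDigits, TryRead and Trace are hypothetical names for this illustration, and all the culture handling and error reporting of the real parser is left out.

```csharp
using System;

static class GreedyDigits
{
    // Simplified model: reads up to maxDigitLen ASCII digits starting at index,
    // advancing index past every digit it consumes.
    public static bool TryRead(string s, ref int index, int minDigitLen, int maxDigitLen, out int value)
    {
        value = 0;
        int start = index;
        while (index - start < maxDigitLen && index < s.Length
               && s[index] >= '0' && s[index] <= '9')
        {
            value = value * 10 + (s[index] - '0');
            index++;
        }
        // Succeeds only if at least minDigitLen digits were consumed.
        return index - start >= minDigitLen;
    }

    // Applies "M" (1 or 2 digits) and then "yy" (exactly 2 digits), left to right.
    public static string Trace(string s)
    {
        int i = 0;
        int month, year;
        bool monthOk = TryRead(s, ref i, 1, 2, out month);
        bool yearOk = TryRead(s, ref i, 2, 2, out year);
        return string.Format("month={0} ({1}), year={2} ({3})", month, monthOk, year, yearOk);
    }
}
```

In this model, GreedyDigits.Trace("115") returns "month=11 (True), year=5 (False)": the month step takes both leading digits, so the year step finds only one digit. Trace("1016") returns "month=10 (True), year=16 (True)".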

Finally, let’s look at the actual parsing in DoStrictParse. There, we have the following loop:

// Scan every character in format and match the pattern in str.
while (format.GetNext()) {
    // We trim inner spaces here, so that we will not eat trailing spaces when
    // AllowTrailingWhite is not used.
    if (parseInfo.fAllowInnerWhite) {
        str.SkipWhiteSpaces();
    }
    if (!ParseByFormat(ref str, ref format, ref parseInfo, dtfi, ref result)) {
       return (false);
    }
}

So basically, this loops over the characters in the format string and tries to match the input string from left to right using that format. ParseByFormat contains additional logic that captures repeated format characters (like yy instead of just y) and uses that information to branch into the different formats. For our months, this is the relevant part:

if (tokenLen <= 2) {
    if (!ParseDigits(ref str, tokenLen, out tempMonth)) {
        if (!parseInfo.fCustomNumberParser ||
            !parseInfo.parseNumberDelegate(ref str, tokenLen, out tempMonth)) {
            result.SetFailure(ParseFailureKind.Format, "Format_BadDateTime", null);
            return (false);
        }
    }
}

So here we come full circle to ParseDigits, which is called with a token length of 1 for a single M. But as we’ve seen above, it will still match two digits if it can, and it does so without validating whether the two-digit number makes any sense as a month. So 130 wouldn’t match as January 2030 either: it would be read as the 13th month and fail later.
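If you want to check the cases discussed here yourself, a minimal sketch like this should do. MyyCheck and Parses are hypothetical names for this illustration; the sketch uses TryParseExact so that a failed parse shows up as false rather than as a FormatException.

```csharp
using System;
using System.Globalization;

static class MyyCheck
{
    // Returns true if the input parses with the exact format "Myy".
    public static bool Parses(string input)
    {
        DateTime ignored;
        return DateTime.TryParseExact(input, "Myy",
            CultureInfo.InvariantCulture, DateTimeStyles.None, out ignored);
    }
}
```

Based on the behaviour described above, MyyCheck.Parses("1016") should return true, while "115" and "130" should both return false.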

poke answered Oct 22 '22 12:10