
Java DateTimeFormatterBuilder with optional pattern results in DateTimeParseException

Goal

Provide a flexible parser for LocalDate instances that can handle input in one of the following formats:

  • yyyy
  • yyyyMM
  • yyyyMMdd

Implementation Attempt

The following class attempts to handle both the first and the second pattern. Parsing works for the year input, but year + month results in the exception outlined below.

import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;

public class DateTest {

    public static void main(String[] args) {
        DateTimeFormatter parser = new DateTimeFormatterBuilder()
        .parseDefaulting(ChronoField.MONTH_OF_YEAR, 1)
        .parseDefaulting(ChronoField.DAY_OF_MONTH, 1)
        .appendPattern("yyyy")
        .optionalStart().appendPattern("MM").optionalEnd().toFormatter();

        System.out.println(parser.parse("2014", LocalDate::from)); // Works
        System.out.println(parser.parse("201411", LocalDate::from)); // Fails
    }
}

The second parse() attempt results in the following exception:

Exception in thread "main" java.time.format.DateTimeParseException: Text '201411' could not be parsed at index 0
at java.time.format.DateTimeFormatter.parseResolved0(DateTimeFormatter.java:1949)
at java.time.format.DateTimeFormatter.parse(DateTimeFormatter.java:1851)

I think my understanding of how optional partial patterns work is lacking. Is my goal of one parser with a flexible format even achievable, or do I need to check on input length and select from a list of parsers? As always, help is appreciated.
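For comparison, the fallback mentioned above (checking the input length and selecting a parser) could look roughly like the sketch below. The names LengthDispatchParser and parseFlexible are made up for illustration, and defaulting a missing month or day to 1 is an assumption carried over from the attempt above.

```java
import java.time.LocalDate;
import java.time.Year;
import java.time.YearMonth;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical helper, not part of the original attempt.
class LengthDispatchParser {

    // Year followed directly by a two-digit month, e.g. "201411".
    private static final DateTimeFormatter YEAR_MONTH = DateTimeFormatter.ofPattern("uuuuMM");

    // Picks a parser based on the input length; a missing month or day becomes 1.
    static LocalDate parseFlexible(String text) {
        switch (text.length()) {
            case 4:
                return Year.parse(text).atDay(1);
            case 6:
                return YearMonth.parse(text, YEAR_MONTH).atDay(1);
            case 8:
                return LocalDate.parse(text, DateTimeFormatter.BASIC_ISO_DATE);
            default:
                throw new DateTimeParseException("Unsupported input length", text, 0);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseFlexible("2014"));
        System.out.println(parseFlexible("201411"));
        System.out.println(parseFlexible("20141130"));
    }
}
```

This works, but it spreads the format knowledge over three parsers, which is what a single flexible formatter would avoid.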

Michael Schmid asked Jan 29 '23 05:01


2 Answers

The real cause of your problem is sign handling. The pattern element "yyyy" parses greedily, consuming as many digits as possible, and in strict parsing it requires a leading positive sign as soon as more than four digits are found. Your input "201411" has six digits and no sign, so the parse fails right at index 0.
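The sign rule can be observed in isolation with a formatter that contains nothing but "yyyy". The following is a minimal sketch; the class YearSignDemo and the helper tryParseYear are hypothetical names used only for this illustration.

```java
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.temporal.ChronoField;

// Hypothetical demo class; only the "yyyy" pattern comes from the question.
class YearSignDemo {

    private static final DateTimeFormatter YEAR_ONLY = DateTimeFormatter.ofPattern("yyyy");

    // Returns the parsed year as text, or "rejected" if parsing fails.
    static String tryParseYear(String text) {
        try {
            return Long.toString(YEAR_ONLY.parse(text).getLong(ChronoField.YEAR));
        } catch (DateTimeParseException e) {
            return "rejected";
        }
    }

    public static void main(String[] args) {
        System.out.println(tryParseYear("2014"));    // four digits, no sign needed
        System.out.println(tryParseYear("201411"));  // six digits without a sign
        System.out.println(tryParseYear("+201411")); // six digits with a leading '+'
    }
}
```

If the analysis above holds, only the unsigned six-digit input is rejected, while the signed one is read as the year 201411.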

My analysis was done in two different ways:

  • debugging (in order to see what is really behind the unclear error message)

  • simulating the behaviour in another parse engine, my library Time4J, in order to get a better error message:

    ChronoFormatter<LocalDate> cf =
    ChronoFormatter
        .ofPattern(
            "yyyy[MM]",
            PatternType.THREETEN,
            Locale.ROOT,
            PlainDate.axis(TemporalType.LOCAL_DATE)
        )
        .withDefault(PlainDate.MONTH_AS_NUMBER, 1)
        .withDefault(PlainDate.DAY_OF_MONTH, 1)
        .with(Leniency.STRICT);
    System.out.println(cf.parse("201411")); 
    // java.text.ParseException: Positive sign must be present for big number.
    

You could circumvent the problem by instructing the builder to always use only four digits for the year:

DateTimeFormatter parser =
    new DateTimeFormatterBuilder()
        .appendValue(ChronoField.YEAR, 4)
        .optionalStart()
        .appendPattern("MM[dd]")
        .optionalEnd()
        .parseDefaulting(ChronoField.MONTH_OF_YEAR, 1)
        .parseDefaulting(ChronoField.DAY_OF_MONTH, 1)
        .toFormatter();

System.out.println(parser.parse("2014", LocalDate::from)); // 2014-01-01
System.out.println(parser.parse("201411", LocalDate::from)); // 2014-11-01
System.out.println(parser.parse("20141130", LocalDate::from)); // 2014-11-30

Pay attention to the position of the defaulting elements in the builder. They are called at the end rather than at the start because the processing of defaulting elements is unfortunately position-sensitive in java.time. I have also nested an extra optional section for the day of month inside the first optional section. To me this solution seems cleaner than a sequence of three optional sections as suggested by Danila Zharenkov, because the latter could also parse quite different inputs with many more digits (a possible misuse of optional sections as a replacement for or-patterns, especially in lenient parsing).

On the position-sensitive behaviour of defaulting elements, here is a quote from the API documentation of DateTimeFormatterBuilder.parseDefaulting():

During parsing, the current state of the parse is inspected. If the specified field has no associated value, because it has not been parsed successfully at that point, then the specified value is injected into the parse result. Injection is immediate, thus the field-value pair will be visible to any subsequent elements in the formatter. As such, this method is normally called at the end of the builder.
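To make the position sensitivity concrete, here is a small sketch that builds the same formatter twice and only moves the parseDefaulting() calls. The class DefaultingOrderDemo, its two constants and the helper tryParse are hypothetical names chosen for this illustration.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.time.temporal.ChronoField;

// Hypothetical demo class comparing the two placements of parseDefaulting().
class DefaultingOrderDemo {

    // Defaults registered before the value elements.
    static final DateTimeFormatter DEFAULTS_FIRST = new DateTimeFormatterBuilder()
            .parseDefaulting(ChronoField.MONTH_OF_YEAR, 1)
            .parseDefaulting(ChronoField.DAY_OF_MONTH, 1)
            .appendValue(ChronoField.YEAR, 4)
            .optionalStart().appendPattern("MM[dd]").optionalEnd()
            .toFormatter();

    // Defaults registered after the value elements, as in the formatter above.
    static final DateTimeFormatter DEFAULTS_LAST = new DateTimeFormatterBuilder()
            .appendValue(ChronoField.YEAR, 4)
            .optionalStart().appendPattern("MM[dd]").optionalEnd()
            .parseDefaulting(ChronoField.MONTH_OF_YEAR, 1)
            .parseDefaulting(ChronoField.DAY_OF_MONTH, 1)
            .toFormatter();

    // Returns the ISO text of the parsed date, or "rejected" if parsing fails.
    static String tryParse(DateTimeFormatter formatter, String text) {
        try {
            return formatter.parse(text, LocalDate::from).toString();
        } catch (DateTimeParseException e) {
            return "rejected";
        }
    }

    public static void main(String[] args) {
        System.out.println(tryParse(DEFAULTS_FIRST, "201411"));
        System.out.println(tryParse(DEFAULTS_LAST, "201411"));
    }
}
```

With the defaults first, the month is already set to 1 by the time "MM" tries to store 11. The conflicting value makes the optional section fail, so the trailing digits stay unparsed and the input is rejected.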


By the way, in my library Time4J I can also define real or-patterns using the symbol "|" and then create this formatter:

ChronoFormatter<LocalDate> cf =
    ChronoFormatter
        .ofPattern(
            "yyyyMMdd|yyyyMM|yyyy",
            PatternType.CLDR,
            Locale.ROOT,
            PlainDate.axis(TemporalType.LOCAL_DATE)
        )
        .withDefault(PlainDate.MONTH_AS_NUMBER, 1)
        .withDefault(PlainDate.DAY_OF_MONTH, 1)
        .with(Leniency.STRICT);
Meno Hochschild answered Feb 07 '23 19:02


Here is a solution. You can define the possible patterns as optional sections inside appendPattern(), and then add defaults for the fields that may be missing.

DateTimeFormatter parser = new DateTimeFormatterBuilder()
        .appendPattern("[yyyy][yyyyMM][yyyyMMdd]")
        .optionalStart()
        .parseDefaulting(ChronoField.MONTH_OF_YEAR, 1)
        .parseDefaulting(ChronoField.DAY_OF_MONTH, 1)
        .optionalEnd()
        .toFormatter();

System.out.println(parser.parse("2014", LocalDate::from)); // Works
System.out.println(parser.parse("201411", LocalDate::from)); // Works
System.out.println(parser.parse("20141102", LocalDate::from)); // Works

The output is

2014-01-01
2014-11-01
2014-11-02
Danila Zharenkov answered Feb 07 '23 18:02