
Why does this date YYYY-MM-DD regex fail in Java?

Tags: java, regex

My first question, and I'm excited... I've lurked since go-live and love the site; however, I apologize for any newbie errors, formatting, etc.

I'm attempting to validate the format of a string field that contains a date in Java. We will receive the date as a string, and I will validate its format before parsing it into a real Date object. The date is passed in the YYYY-MM-DD format. However, I'm stuck on one of my tests: if I pass in "1999-12-33", the test fails (as it should, with a day number of 33) with this incomplete pattern:

((19|20)\\d{2})-([1-9]|0[1-9]|1[0-2])-([12][0-9]|3[01])

However, as soon as I add the 0[1-9]|[1-9]| alternatives to the start of the day group, as in the pattern below, it passes the test (but should not):

((19|20)\\d{2})-([1-9]|0[1-9]|1[0-2])-(0[1-9]|[1-9]|[12][0-9]|3[01])

*Additional note: I know I can change 0[1-9]|[1-9] into 0?[1-9], but I wanted to break everything down to its simplest form to find out why this isn't working.

Here is the scrap test I've put together to run through all the different date scenarios:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class scrapTest {
    public scrapTest() {
    }

    public static void main(String[] args) {

        scrapTest a = new scrapTest();
        boolean flag = a.verfiyDateFormat("1999-12-33");
    }   

    private boolean verfiyDateFormat(String dateStr){
        Pattern datePattern = Pattern.compile("((19|20)\\d{2})-([1-9]|0[1-9]|1[0-2])-(0[1-9]|[1-9]|[12][0-9]|3[01])");
        Matcher dateMatcher = datePattern.matcher(dateStr);
        if(!dateMatcher.find()){
            System.out.println("Invalid date format!!! -> " + dateStr);
            return false;
        }
        System.out.println("Valid date format.");
        return true;
    } 
}

I've been programming for ~10 years but am extremely new to Java, so please feel free to explain anything at as elementary a level as you see fit.

ProfessionalAmateur asked Mar 24 '10 15:03


4 Answers

I think it's because you're using dateMatcher.find() rather than dateMatcher.matches(). The former looks for a match anywhere in the string; the latter tries to match the entire string. See the API page. So basically the first 3 in 33 matches the [1-9] you just added, and the second 3 is not matched by anything, but the method still returns true.
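The difference can be checked with a minimal sketch that runs the question's pattern through both methods. The class and helper names (FindVersusMatches, foundAnywhere, matchesWhole) are made up for illustration; only Pattern and Matcher come from the JDK.

```java
import java.util.regex.Pattern;

class FindVersusMatches {
    // The pattern from the question, unchanged.
    static final Pattern DATE_PATTERN = Pattern.compile(
            "((19|20)\\d{2})-([1-9]|0[1-9]|1[0-2])-(0[1-9]|[1-9]|[12][0-9]|3[01])");

    // find() succeeds as soon as some substring of the input matches.
    static boolean foundAnywhere(String input) {
        return DATE_PATTERN.matcher(input).find();
    }

    // matches() succeeds only if the entire input matches.
    static boolean matchesWhole(String input) {
        return DATE_PATTERN.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(foundAnywhere("1999-12-33")); // true
        System.out.println(matchesWhole("1999-12-33"));  // false
        System.out.println(matchesWhole("1999-12-31"));  // true
    }
}
```

With matches(), the regex engine has to backtrack out of [1-9] and try 3[01] for the day, so "1999-12-31" is still accepted while "1999-12-33" is rejected.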

miorel answered Oct 02 '22 13:10


(0[1-9]|[1-9]|[12][0-9]|3[01])

The second alternative, [1-9], looks to be the part that's succeeding, as you don't have a test for the end of the string.

It's matching 1999-12-3, not 1999-12-33
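One way to see this, sketched under the assumption that the question's pattern is used as-is: print what find() actually matched with Matcher.group(), then add ^ and $ anchors as one possible "test for the end of the string". The names MatchedSubstring, QUESTION_REGEX and firstMatch are hypothetical helpers, not part of the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchedSubstring {
    // The pattern from the question, unchanged.
    static final String QUESTION_REGEX =
            "((19|20)\\d{2})-([1-9]|0[1-9]|1[0-2])-(0[1-9]|[1-9]|[12][0-9]|3[01])";

    // Returns the text that find() matched, or null if nothing matched.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Unanchored: find() stops after the first 3 of the day.
        System.out.println(firstMatch(QUESTION_REGEX, "1999-12-33")); // 1999-12-3

        // Anchored at both ends: the trailing 3 can no longer be ignored.
        String anchored = "^" + QUESTION_REGEX + "$";
        System.out.println(firstMatch(anchored, "1999-12-33")); // null
        System.out.println(firstMatch(anchored, "1999-12-31")); // 1999-12-31
    }
}
```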

Broam answered Oct 02 '22 12:10


How about using SimpleDateFormat, which is made just for that?

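// Needs java.text.ParseException, java.text.SimpleDateFormat, java.util.Date
SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd");
format.setLenient(false); // otherwise "1999-12-33" rolls over into January
try {
    Date d = format.parse(somestring);
    // d is the Date
} catch (ParseException e) {
    // somestring is not a Date (parse(String) throws; it does not return null)
}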

Docs for SimpleDateFormat
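A runnable sketch of this approach follows, with two caveats worth knowing: parse(String) signals failure by throwing ParseException, and SimpleDateFormat is lenient by default, so an out-of-range day is rolled over instead of rejected. The class and method names (StrictSimpleDateFormat, parseStrict, parseLenientAndReformat) are invented for the example.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

class StrictSimpleDateFormat {
    // Returns the parsed Date, or null when the text is not a valid calendar date.
    static Date parseStrict(String text) {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd");
        format.setLenient(false); // reject out-of-range fields such as day 33
        try {
            return format.parse(text);
        } catch (ParseException e) {
            return null;
        }
    }

    // Parses with the default (lenient) settings and formats the result back,
    // to show what the lenient parser turned the input into.
    static String parseLenientAndReformat(String text) {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd");
        try {
            return format.format(format.parse(text));
        } catch (ParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseStrict("1999-12-33") == null);     // true
        System.out.println(parseStrict("1999-12-31") != null);     // true
        System.out.println(parseLenientAndReformat("1999-12-33")); // 2000-01-02
    }
}
```

Note that parse(String) may stop before the end of the input, so a regex or length check on the format can still be useful alongside it.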

Pablo Lalloni answered Oct 02 '22 13:10


Not really an answer to the question, but a suggestion: write a simpler regex, and then do numeric validation in Java, instead of in your regex:

(\\d{4})-(\\d{2})-(\\d{2})

Match this against your input, extract the relevant groups and convert to integers, then check the year, month, and day parts to ensure they're within an acceptable range.
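A minimal sketch of that idea is below. The class name ShapeThenRange, the method isPlausibleDate, and the chosen ranges are assumptions for illustration: the year range 1900 to 2099 simply mirrors the (19|20)\\d{2} part of the question's pattern, and the day check is a flat 1 to 31 that does not account for month lengths.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ShapeThenRange {
    // The regex only checks the shape: four digits, two digits, two digits.
    static final Pattern SHAPE = Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})");

    static boolean isPlausibleDate(String input) {
        Matcher m = SHAPE.matcher(input);
        if (!m.matches()) { // matches() requires the whole input to fit the shape
            return false;
        }
        int year = Integer.parseInt(m.group(1));
        int month = Integer.parseInt(m.group(2));
        int day = Integer.parseInt(m.group(3));
        // Range checks are assumptions; adjust them to the real requirements.
        return year >= 1900 && year <= 2099
                && month >= 1 && month <= 12
                && day >= 1 && day <= 31;
    }

    public static void main(String[] args) {
        System.out.println(isPlausibleDate("1999-12-33")); // false
        System.out.println(isPlausibleDate("1999-12-31")); // true
        System.out.println(isPlausibleDate("1999-13-01")); // false
    }
}
```

Keeping the range logic in Java makes each rule readable on its own, at the cost of a few more lines than a single regex.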

Sam Barnum answered Oct 02 '22 13:10