I'm trying to validate a string that's supposed to contain a timestamp in ISO 8601 format (commonly used in JSON).
Python's strptime seems to be very forgiving when it comes to validating zero-padding; see the code example below (note that the hour is missing a leading zero):
>>> import datetime
>>> s = '1985-08-23T3:00:00.000'
>>> datetime.datetime.strptime(s, '%Y-%m-%dT%H:%M:%S.%f')
datetime.datetime(1985, 8, 23, 3, 0)
It gracefully accepts a string whose hour is not zero-padded, for example, and doesn't raise a ValueError exception as I would expect.
Is there any way to force strptime to validate that the fields are zero-padded? Or is there any other built-in function in Python's standard library that does?
I would like to avoid writing my own regexp for this.
Method #1: Using strptime(). The strptime function, usually used to convert a date string to a datetime object, raises ValueError when the string doesn't match the format or isn't a real date, and hence can be used to check validity.
The date validation you want to achieve in Python will largely depend on the format of the date you have. The strptime function from the datetime module can be used to parse strings to dates/times.
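As a rough illustration of the approach above, the sketch below wraps strptime in a try/except and then adds a stricter round-trip comparison. The helper names `parses` and `parses_strictly` are made up for this example, and the round-trip trick assumes the exact format from the question (three fractional digits, no time zone).

```python
from datetime import datetime

FMT = '%Y-%m-%dT%H:%M:%S.%f'  # format string taken from the question

def parses(s):
    """Lenient check: True if strptime accepts s at all."""
    try:
        datetime.strptime(s, FMT)
    except ValueError:
        return False
    return True

def parses_strictly(s):
    """Stricter check (hypothetical helper): re-serialising must give back s."""
    try:
        parsed = datetime.strptime(s, FMT)
    except ValueError:
        return False
    # isoformat(timespec='milliseconds') zero-pads every field and emits
    # exactly three fractional digits, so any unpadded input will differ.
    return parsed.isoformat(timespec='milliseconds') == s

for s in ('1985-08-23T3:00:00.000',
          '1985-08-23T03:00:00.000',
          '1985-13-23T03:00:00.000'):
    print(s, parses(s), parses_strictly(s))
```

The lenient check accepts the unpadded hour, while the round-trip comparison rejects it without a hand-written regex. It only suits a single fixed format; inputs with time zone offsets or a different number of fractional digits would need a different comparison.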
The datetime module has a function called strptime that will convert the string to a datetime object (provided that we give it a format). Then we can do a simple conversion. You'll need to be more specific about the algorithm if you want the time delta for all items.
There is already an answer explaining that parsing ISO 8601 or RFC 3339 date/times with Python's strptime() is impossible: How to parse an ISO 8601-formatted date? So, to answer your question: no, there is no way in the standard Python library to reliably parse such a date. Regarding the regex suggestions, a date string like
2020-14-32T45:33:44.123
would be accepted as a valid date by a pattern that only checks digit counts. There are lots of Python modules (search for "iso8601" on https://pypi.python.org), but building a complete ISO 8601 validator would require handling things like leap seconds, the list of possible time zone offset values, and much more.
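For the single fixed format in the question, newer Python versions (3.7 and later) provide `datetime.fromisoformat`, which may be enough even though it is not a complete ISO 8601 validator. The sketch below assumes such a version; `is_iso_timestamp` is a hypothetical helper name. The set of strings it accepts differs between Python versions and is broader than this one format (date-only strings are accepted too, for instance), so treat it as a starting point rather than a strict format check.

```python
from datetime import datetime

def is_iso_timestamp(s):
    """Hypothetical helper: True if datetime.fromisoformat (3.7+) accepts s."""
    try:
        datetime.fromisoformat(s)
    except ValueError:
        return False
    return True

print(is_iso_timestamp('1985-08-23T03:00:00.000'))  # zero-padded hour
print(is_iso_timestamp('1985-08-23T3:00:00.000'))   # hour not zero-padded
print(is_iso_timestamp('2020-14-32T45:33:44.123'))  # out-of-range fields
```

Only the first string should be accepted: the unpadded hour fails the fixed-width field parsing, and the out-of-range month is rejected when the datetime object is constructed.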
You said you want to avoid a regex, but this is actually the type of problem where a regex is appropriate. As you discovered, strptime is very flexible about the input it will accept. However, the regex for this problem is relatively easy to compose:
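import re

# Shape check only: YYYY-MM-DDTHH:MM:SS.mmm with every field zero-padded.
# The dot is escaped so it matches a literal '.', and fullmatch() rejects
# any trailing characters after the milliseconds.
date_pattern = re.compile(r'\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}')
s_list = [
    '1985-08-23T3:00:00.000',   # hour is not zero-padded
    '1985-08-23T03:00:00.000',
]
for s in s_list:
    if date_pattern.fullmatch(s):
        print("%s is valid" % s)
    else:
        print("%s is invalid" % s)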
Output
1985-08-23T3:00:00.000 is invalid
1985-08-23T03:00:00.000 is valid
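A pattern that only counts digits still accepts out-of-range values such as month 14 or hour 45. One possible way around that, sketched here under the assumption that the format is exactly the one in the question, is to let the regex check the shape and let strptime check the ranges. `SHAPE`, `FMT`, and `is_valid_timestamp` are illustrative names, not part of any library.

```python
import re
from datetime import datetime

# Shape: fixed-width, zero-padded fields with exactly three fractional digits.
SHAPE = re.compile(
    r'[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3}'
)
FMT = '%Y-%m-%dT%H:%M:%S.%f'

def is_valid_timestamp(s):
    """Hypothetical helper: regex checks the shape, strptime checks the ranges."""
    if not SHAPE.fullmatch(s):
        return False
    try:
        datetime.strptime(s, FMT)
    except ValueError:
        return False
    return True

for s in ('1985-08-23T03:00:00.000',
          '1985-08-23T3:00:00.000',
          '2020-14-32T45:33:44.123'):
    print("%s is %s" % (s, "valid" if is_valid_timestamp(s) else "invalid"))
```

The explicit `[0-9]` classes are used because `\d` in a Python 3 string pattern also matches non-ASCII digits unless `re.ASCII` is set.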