
Mixing datetime.strptime() arguments

It is quite a common mistake to mix up the format-string and date-string arguments of datetime.strptime(), writing:

datetime.strptime("%B %d, %Y", "January 8, 2014")

instead of the other way around:

datetime.strptime("January 8, 2014", "%B %d, %Y")

Of course, this fails at runtime:

>>> datetime.strptime("%B %d, %Y", "January 8, 2014")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/_strptime.py", line 325, in _strptime
    (data_string, format))
ValueError: time data '%B %d, %Y' does not match format 'January 8, 2014'

But, is it possible to catch this problem statically even before actually running the code? Is it something pylint or flake8 can help with?


I've tried the PyCharm code inspection, but neither snippet triggers a warning, probably because both arguments have the same type: they are both strings, which makes the problem more difficult. A checker would have to actually analyze whether a string is a datetime format string or not. The Language Injections feature of PyCharm/IDEA also looks relevant.
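As a rough illustration of what "analyzing whether a string is a format string" could mean, here is a minimal sketch. It relies on one assumption: a string without any directives comes back from strftime unchanged, while a string with directives does not. The helper names looks_like_format and arguments_look_swapped are hypothetical, and strings containing a stray "%" may behave differently across platforms, so treat this as a heuristic only.

```python
from datetime import datetime

# Arbitrary reference date, chosen only for this illustration.
_PROBE = datetime(2014, 1, 8)


def looks_like_format(text: str) -> bool:
    """Heuristic: a string with no directives survives strftime unchanged."""
    return _PROBE.strftime(text) != text


def arguments_look_swapped(first: str, second: str) -> bool:
    """True when the first argument looks like a format and the second does not."""
    return looks_like_format(first) and not looks_like_format(second)


print(arguments_look_swapped("%B %d, %Y", "January 8, 2014"))
print(arguments_look_swapped("January 8, 2014", "%B %d, %Y"))
```

A check like this only works when the actual string values are available, which is exactly what a static tool usually lacks.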

Asked by alecxe on Jul 01 '16 at 14:07



1 Answer

I claim that this cannot be checked statically in the general case.

Consider the following snippet:

d = datetime.strptime(read_date_from_network(), read_format_from_file())

This code may be completely valid, where both read_date_from_network and read_format_from_file really do return strings of the proper format -- or they may be total garbage, both returning None or some crap. Regardless, that information can only be determined at runtime -- hence, a static checker is powerless.


What's more, given the current definition of datetime.strptime, even if we were using a statically typed language, we wouldn't be able to catch this error (except in very specific cases) -- the reason being that the signature of this function doomed us from the start:

classmethod datetime.strptime(date_string, format)

In this definition, date_string and format are both strings, even though they actually have special meaning. Even if we had something analogous in a statically typed language like this:

public DateTime strpTime(String dateString, String format)

The compiler (and linter and everyone else) still only sees:

public DateTime strpTime(String, String)

Which means that none of the following are distinguishable from each other:

strpTime("%B %d, %Y", "January 8, 2014") // strpTime(String, String) CHECK
strpTime("January 8, 2014", "%B %d, %Y") // strpTime(String, String) CHECK
strpTime("cat", "bat") // strpTime(String, String) CHECK

This isn't to say that it can't be done at all -- there do exist some linters for statically typed languages such as Java/C++/etc. that will inspect string literals when you pass them to some specific functions (like printf, etc.), but this can only be done when you're calling that function directly with a literal format string. The same linters become just as helpless in the first case that I presented, because it's simply not yet known if the strings will be the right format.

That is, a linter may be able to warn about this:

// Linter regex-es the first argument, sees %B et al., warns you
strpTime("%B %d, %Y", "January 8, 2014")

but it would not be able to warn about this:

strpTime(scanner.readLine(), scanner.readLine())

Now, the same could be engineered into a Python linter, but I don't believe it would be very useful: functions are first-class, so I could easily defeat the (hypothetical) linter by writing:

f = datetime.strptime
d = f("January 8, 2014", "%B %d, %Y")

And then we're pretty much hosed again.
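To make these limits concrete, here is a minimal sketch of such a hypothetical checker built on the standard ast module. The function name find_swapped_strptime and the "% followed by a letter" regex are assumptions made for illustration; this is not how pylint or flake8 work. It flags a direct call with a literal first argument that looks like a format, and stays silent for dynamic arguments and for the aliased call.

```python
import ast
import re
from typing import List

# Assumption: "%" followed by a letter (or "%") marks a format directive.
_DIRECTIVE = re.compile(r"%[A-Za-z%]")


def find_swapped_strptime(source: str) -> List[int]:
    """Return line numbers of x.strptime(...) calls whose first argument
    is a string literal that looks like a format string."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Only attribute calls such as datetime.strptime(...) are recognized.
        if not (isinstance(func, ast.Attribute) and func.attr == "strptime"):
            continue
        if len(node.args) < 2:
            continue
        first = node.args[0]
        if (
            isinstance(first, ast.Constant)
            and isinstance(first.value, str)
            and _DIRECTIVE.search(first.value)
        ):
            hits.append(node.lineno)
    return sorted(hits)


SAMPLE = '''\
from datetime import datetime
datetime.strptime("%B %d, %Y", "January 8, 2014")
datetime.strptime("January 8, 2014", "%B %d, %Y")
datetime.strptime(read_date(), read_format())
f = datetime.strptime
f("%B %d, %Y", "January 8, 2014")
'''

print(find_swapped_strptime(SAMPLE))  # [2]
```

Only line 2 of the sample is reported. The swapped call through the alias f on line 6 slips past, as does anything whose arguments are not literals.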


Bonus: What Went Wrong

The problem here is that datetime.strptime gives implicit meaning to each of these strings, but it doesn't surface that information to the type system. What could have been done was to give the two strings differing types -- then there could have been more safety, albeit at the expense of some ease-of-use.

E.g. (using PEP 484 type annotations, a real thing!):

from datetime import date

class DateString(str):
    pass

class FormatString(str):
    pass

class datetime(date):
    ...
    @classmethod
    def strptime(cls, date_string: DateString, format: FormatString) -> "datetime":
        ...  # etc. etc.

Then it would start to be feasible to provide good linting in the general case -- though the DateString and FormatString classes would need to take care of validating their input, because again, the type system can't do anything at that level.
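A variation on this idea can be tried today without touching the standard library. The sketch below uses typing.NewType (also from PEP 484) and a hypothetical wrapper named parse_date; neither the wrapper nor the two type names exist in the datetime module. NewType does no validation at runtime, so the safety comes only from a type checker such as mypy.

```python
from datetime import datetime
from typing import NewType

# Distinct static types that are plain str objects at runtime.
DateString = NewType("DateString", str)
FormatString = NewType("FormatString", str)


def parse_date(date_string: DateString, fmt: FormatString) -> datetime:
    """Hypothetical wrapper whose signature tells the two strings apart."""
    return datetime.strptime(date_string, fmt)


ISO_DATE = FormatString("%Y-%m-%d")
parsed = parse_date(DateString("2014-01-08"), ISO_DATE)
print(parsed)

# A type checker would reject the swapped call below, because a
# FormatString is not a DateString (and vice versa):
# parse_date(ISO_DATE, DateString("2014-01-08"))
```

The cost is the one mentioned above: every call site has to wrap its strings, and nothing stops someone from wrapping the wrong string in the wrong type.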


Afterword:

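I think the best way to deal with this is to avoid the problem where you can. When the goal is formatting rather than parsing, the strftime method is bound to a specific datetime object and takes just a format string argument, so there is nothing to mix up. Note that strftime is the inverse of strptime (it produces a string instead of parsing one), so it is not a drop-in replacement. Still, it gives us a function signature that doesn't cut us when we hug it. Yay.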
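For the parsing direction, one way to get a similarly hard-to-misuse signature is a keyword-only wrapper. This is only a sketch: parse_datetime and its parameter names text and fmt are hypothetical, not part of the standard library.

```python
from datetime import datetime


def parse_datetime(*, text: str, fmt: str) -> datetime:
    """Hypothetical wrapper: both arguments must be passed by name."""
    return datetime.strptime(text, fmt)


moment = parse_datetime(text="2014-01-08", fmt="%Y-%m-%d")

# strftime takes only a format string, so there is nothing to swap.
print(moment.strftime("%Y-%m-%d"))
```

Calling parse_datetime with positional arguments raises a TypeError, so a swapped pair cannot slip through silently; the names at the call site also make a mix-up easy to spot in review.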

Answered by Dan on Oct 29 '22 at 11:10