Is there an easy way to parse HTTP date-strings in Python? According to the standard, there are several ways to format HTTP date strings; the method should be able to handle this.
In other words, I want to convert a string like "Wed, 23 Sep 2009 22:15:29 GMT" to a Python time structure.
Python has a built-in method to parse dates, strptime(). For instance, the string "2020-01-01 14:00" can be parsed into a datetime object with the format "%Y-%m-%d %H:%M". The documentation for strptime() provides a great overview of all format-string options.
With strptime(), a date and time in string form can be converted to a datetime object. The first parameter is the string and the second is the format specifier. One advantage of converting to a datetime is that the month, day, or time can then be accessed individually. Note that strptime() matches exactly one format per call, so it does not by itself cope with the several HTTP date formats.
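As a rough illustration of the strptime() approach described above, here is a minimal sketch. The HTTP format string is an assumption written for this one layout only, and %a and %b depend on the current locale, so it is only expected to work with English day and month names.

```python
from datetime import datetime

# Parse the simple example string with an explicit format.
parsed = datetime.strptime("2020-01-01 14:00", "%Y-%m-%d %H:%M")

# Individual fields are available once the string is a datetime.
print(parsed.year, parsed.month, parsed.hour)

# Assumed format string for one HTTP date layout; "GMT" is matched literally,
# so the result is a naive datetime without timezone information.
http_date = datetime.strptime(
    "Wed, 23 Sep 2009 22:15:29 GMT", "%a, %d %b %Y %H:%M:%S GMT"
)
print(http_date)
```

Any other HTTP date layout would need its own format string, which is why the email.utils helpers are usually more convenient here.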
>>> import email.utils as eut
>>> eut.parsedate('Wed, 23 Sep 2009 22:15:29 GMT')
(2009, 9, 23, 22, 15, 29, 0, 1, -1)
If you want a datetime.datetime object, you can do:
import datetime

def my_parsedate(text):
    # Keep only year, month, day, hour, minute, second; the timezone is ignored.
    return datetime.datetime(*eut.parsedate(text)[:6])
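Because parsedate() drops the timezone, a string carrying a numeric offset would come out shifted. A possible variant, sketched here with the hypothetical helper name parse_to_utc, uses email.utils.parsedate_tz() and email.utils.mktime_tz() to take the offset into account and returns None when the string cannot be parsed.

```python
import datetime
import email.utils as eut

def parse_to_utc(text):
    """Hypothetical helper: parse a date string into an aware UTC datetime."""
    parsed = eut.parsedate_tz(text)  # 10-tuple, last item is the UTC offset
    if parsed is None:
        return None  # the string was not recognised as a date
    # mktime_tz() applies the offset and returns seconds since the epoch.
    # If the string has no timezone at all, it falls back to local time.
    timestamp = eut.mktime_tz(parsed)
    return datetime.datetime.fromtimestamp(timestamp, tz=datetime.timezone.utc)

print(parse_to_utc('Wed, 23 Sep 2009 22:15:29 GMT'))
print(parse_to_utc('Wed, 23 Sep 2009 22:15:29 +0200'))
```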
Since Python 3.3 there's email.utils.parsedate_to_datetime, which can parse RFC 5322 timestamps (aka IMF-fixdate, the Internet Message Format fixed-length format, a subset of HTTP-date of RFC 7231).
>>> from email.utils import parsedate_to_datetime
>>> s = 'Sun, 06 Nov 1994 08:49:37 GMT'
>>> parsedate_to_datetime(s)
datetime.datetime(1994, 11, 6, 8, 49, 37, tzinfo=datetime.timezone.utc)
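Since the question asks about the several HTTP date formats, the following sketch wraps parsedate_to_datetime in a hypothetical helper, parse_http_date. It assumes that a result without timezone information (as happens with the asctime-style layout, which carries no zone) should be read as UTC, on the grounds that HTTP dates are expressed in GMT. It is not a validating HTTP-date parser.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def parse_http_date(text):
    """Hypothetical helper: return an aware UTC datetime for an HTTP date."""
    dt = parsedate_to_datetime(text)
    if dt.tzinfo is None:
        # Assumption: no zone in the string means GMT, as HTTP prescribes.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)

# IMF-fixdate and the obsolete asctime-style layout.
print(parse_http_date('Sun, 06 Nov 1994 08:49:37 GMT'))
print(parse_http_date('Sun Nov  6 08:49:37 1994'))
```

Unparseable input raises an exception from parsedate_to_datetime, so callers that handle untrusted headers may want to catch it.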
There's also the undocumented http.cookiejar.http2time, which can achieve the same as follows:
>>> from datetime import datetime, timezone
>>> from http.cookiejar import http2time
>>> s = 'Sun, 06 Nov 1994 08:49:37 GMT'
>>> # fromtimestamp(..., tz=...) avoids utcfromtimestamp, deprecated since Python 3.12
>>> datetime.fromtimestamp(http2time(s), tz=timezone.utc)
datetime.datetime(1994, 11, 6, 8, 49, 37, tzinfo=datetime.timezone.utc)
It was introduced in Python 2.4 as cookielib.http2time for dealing with the cookie Expires attribute, which is expressed in the same format.
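http2time returns None rather than raising when it does not recognise the string, so a small wrapper can make that explicit. The name cookie_expiry below is hypothetical, and since http2time is undocumented its behaviour is not guaranteed across Python versions.

```python
from datetime import datetime, timezone
from http.cookiejar import http2time

def cookie_expiry(text):
    """Hypothetical helper: parse an Expires-style date, or return None."""
    seconds = http2time(text)  # seconds since the epoch, or None on failure
    if seconds is None:
        return None
    return datetime.fromtimestamp(seconds, tz=timezone.utc)

print(cookie_expiry('Sun, 06 Nov 1994 08:49:37 GMT'))
print(cookie_expiry('garbage'))
```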