For the sake of interest I want to convert video durations from YouTubes ISO 8601
to seconds. To future proof my solution, I picked a really long video to test it against.
The API provides this for its duration - "duration": "P1W2DT6H21M32S"
I tried parsing this duration with dateutil
as suggested in stackoverflow.com/questions/969285.
import dateutil.parser
duration = = dateutil.parser.parse('P1W2DT6H21M32S')
This throws an exception
TypeError: unsupported operand type(s) for +=: 'NoneType' and 'int'
What am I missing?
Python's built-in dateutil module only supports parsing ISO 8601 dates, not ISO 8601 durations. For that, you can use the "isodate" library (in pypi at https://pypi.python.org/pypi/isodate -- install through pip or easy_install). This library has full support for ISO 8601 durations, converting them to datetime.timedelta objects. So once you've imported the library, it's as simple as:
import isodate
dur = isodate.parse_duration('P1W2DT6H21M32S')
print(dur.total_seconds())
Works on python 2.7+. Adopted from a JavaScript one-liner for Youtube v3 question here.
import re
def YTDurationToSeconds(duration):
match = re.match('PT(\d+H)?(\d+M)?(\d+S)?', duration).groups()
hours = _js_parseInt(match[0]) if match[0] else 0
minutes = _js_parseInt(match[1]) if match[1] else 0
seconds = _js_parseInt(match[2]) if match[2] else 0
return hours * 3600 + minutes * 60 + seconds
# js-like parseInt
# https://gist.github.com/douglasmiranda/2174255
def _js_parseInt(string):
return int(''.join([x for x in string if x.isdigit()]))
# example output
YTDurationToSeconds(u'PT15M33S')
# 933
Handles iso8061 duration format to extent Youtube Uses up to hours
Here's my answer which takes 9000's regex solution (thank you - amazing mastery of regex!) and finishes the job for the original poster's YouTube use case i.e. converting hours, minutes, and seconds to seconds. I used .groups()
instead of .groupdict()
, followed by a couple of lovingly constructed list comprehensions.
import re
def yt_time(duration="P1W2DT6H21M32S"):
"""
Converts YouTube duration (ISO 8061)
into Seconds
see http://en.wikipedia.org/wiki/ISO_8601#Durations
"""
ISO_8601 = re.compile(
'P' # designates a period
'(?:(?P<years>\d+)Y)?' # years
'(?:(?P<months>\d+)M)?' # months
'(?:(?P<weeks>\d+)W)?' # weeks
'(?:(?P<days>\d+)D)?' # days
'(?:T' # time part must begin with a T
'(?:(?P<hours>\d+)H)?' # hours
'(?:(?P<minutes>\d+)M)?' # minutes
'(?:(?P<seconds>\d+)S)?' # seconds
')?') # end of time part
# Convert regex matches into a short list of time units
units = list(ISO_8601.match(duration).groups()[-3:])
# Put list in ascending order & remove 'None' types
units = list(reversed([int(x) if x != None else 0 for x in units]))
# Do the maths
return sum([x*60**units.index(x) for x in units])
Sorry for not posting higher up - still new here and not enough reputation points to add comments.
Isn't the video 1 week, 2 days, 6 hours 21 minutes 32 seconds long?
Youtube shows it as 222 hours 21 minutes 17 seconds; 1 * 7 * 24 + 2 * 24 + 6 = 222. I don't know where 17 seconds vs 32 seconds discrepancy comes from, though; can as well be a rounding error.
To my mind, writing a parser for that is not that hard. Unfortunately dateutil
does not seem to parse intervals, only datetime points.
Update:
I see that there's a package for this, but just as an example of regexp power, brevity, and incomprehensible syntax, here's a parser for you:
import re
# see http://en.wikipedia.org/wiki/ISO_8601#Durations
ISO_8601_period_rx = re.compile(
'P' # designates a period
'(?:(?P<years>\d+)Y)?' # years
'(?:(?P<months>\d+)M)?' # months
'(?:(?P<weeks>\d+)W)?' # weeks
'(?:(?P<days>\d+)D)?' # days
'(?:T' # time part must begin with a T
'(?:(?P<hours>\d+)H)?' # hourss
'(?:(?P<minutes>\d+)M)?' # minutes
'(?:(?P<seconds>\d+)S)?' # seconds
')?' # end of time part
)
from pprint import pprint
pprint(ISO_8601_period_rx.match('P1W2DT6H21M32S').groupdict())
# {'days': '2',
# 'hours': '6',
# 'minutes': '21',
# 'months': None,
# 'seconds': '32',
# 'weeks': '1',
# 'years': None}
I deliberately am not calculating the exact number of seconds from these data here. It looks trivial (see above), but really isn't. For instance, distance of 2 months from January 1st is 58 days (30+28) or 59 (30+29), depending on year, while from March 1st it's always 61 days. A proper calendar implementation should take all this into account; for a Youtube clip length calculation, it must be excessive.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With