This question is related to this other question @ SuperUser.
I want to download the TED Talks and their respective subtitles for offline viewing. For instance, let's take this short talk by Richard St. John; the high-resolution video download URL is the following:
http://www.ted.com/talks/download/video/5118/talk/70
And the respective JSON-encoded English subtitles can be downloaded at:
http://www.ted.com/talks/subtitles/id/70/lang/eng
Here is an excerpt from the beginning of the actual subtitle file:
{
"captions": [{
"content": "This is really a two hour presentation I give to high school students,",
"startTime": 0,
"duration": 3000,
"startOfParagraph": false
}, {
"content": "cut down to three minutes.",
"startTime": 3000,
"duration": 1000,
"startOfParagraph": false
}, {
"content": "And it all started one day on a plane, on my way to TED,",
"startTime": 4000,
"duration": 3000,
"startOfParagraph": false
}, {
"content": "seven years ago."
And from the end of the subtitle:
{
"content": "Or failing that, do the eight things -- and trust me,",
"startTime": 177000,
"duration": 3000,
"startOfParagraph": false
}, {
"content": "these are the big eight things that lead to success.",
"startTime": 180000,
"duration": 4000,
"startOfParagraph": false
}, {
"content": "Thank you TED-sters for all your interviews!",
"startTime": 184000,
"duration": 2000,
"startOfParagraph": false
}]
}
I want to write an app that automatically downloads the high-resolution version of the video and all the available subtitles, but I'm having a really hard time: I have to convert the subtitles to a format compatible with VLC (or any other decent video player), with .srt or .sub as my first choices, and I have no idea what the startTime and duration keys of the JSON file represent.
What I know so far is this:
startTime starts at 0 with a duration of 3000: 0 + 3000 = 3000
startTime ends at 184000 with a duration of 2000: 184000 + 2000 = 186000
It may also be worth noting the following JavaScript snippet:
introDuration:16500,
adDuration:4000,
postAdDuration:2000,
So my question is: what logic should I apply to convert the startTime and duration values to a .srt compatible format:
1
00:01:30,200 --> 00:01:32,201
MEGA DENG COOPER MINE, INDIA
2
00:01:37,764 --> 00:01:39,039
Watch out, watch out!
Or to a .sub compatible format:
{FRAME_FROM}{FRAME_TO}This is really a two hour presentation I give to high school students,
{FRAME_FROM}{FRAME_TO}cut down to three minutes.
Can anyone help me out with this?
Ninh Bui nailed it; the formula is the following:
introDuration - adDuration + startTime ... introDuration - adDuration + startTime + duration
This approach allows me to convert directly to .srt format (no need to know the video length or FPS) in two ways:
00:00:12,500 --> 00:00:15,500
This is really a two hour presentation I give to high school students,
00:00:15,500 --> 00:00:16,500
cut down to three minutes.
And:
00:00:00,16500 --> 00:00:00,19500
And it all started one day on a plane, on my way to TED,
00:00:00,19500 --> 00:00:00,20500
seven years ago.
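The formula above can be sketched in Python as follows. This is only a minimal illustration, not a definitive converter: the function names are made up for this example, and the default values for introDuration and adDuration are the ones from the JavaScript snippet quoted in the question, which may well differ from talk to talk. It produces the first of the two timestamp styles (normalized HH:MM:SS,mmm).

```python
def ms_to_srt_timestamp(ms):
    """Format a millisecond count as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def captions_to_srt(captions, intro_duration=16500, ad_duration=4000):
    """Turn the "captions" list from the JSON into SRT text.

    Assumption: the defaults are the values seen in the question's
    JavaScript snippet; check them for each talk.
    """
    offset = intro_duration - ad_duration  # shift applied to every caption
    blocks = []
    for index, caption in enumerate(captions, start=1):
        start = offset + caption["startTime"]
        end = start + caption["duration"]
        blocks.append(
            f"{index}\n"
            f"{ms_to_srt_timestamp(start)} --> {ms_to_srt_timestamp(end)}\n"
            f"{caption['content']}\n"
        )
    # SRT entries are separated by a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    sample = [
        {"content": "This is really a two hour presentation I give to high school students,",
         "startTime": 0, "duration": 3000},
        {"content": "cut down to three minutes.",
         "startTime": 3000, "duration": 1000},
    ]
    print(captions_to_srt(sample))
```

To use it on a real file, load the downloaded JSON with json.load and pass its "captions" list to captions_to_srt.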
My guess would be that the times in the JSON are expressed in milliseconds, e.g. 1000 = 1 second. There is probably a main timer, where startTime indicates the point on the timeline at which the subtitle should appear and duration is the amount of time the subtitle should remain on screen. This theory is further supported by the arithmetic: 186000 / 1000 = 186 seconds = 186 / 60 = 3.1 minutes = 3 minutes and 6 seconds. The remaining seconds are probably applause ;-)
With this information you should also be able to calculate from what frame to what frame each subtitle applies. You already know the frames per second, so all you need to do is multiply startTime in seconds by the FPS to get the begin frame. The end frame can be obtained the same way: (startTime + duration) / 1000 * fps :-)
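As a rough sketch of that frame calculation in Python, assuming the times are indeed milliseconds: the function names and the offset_ms parameter are hypothetical, and the frame rate must come from the actual video file (the 25 used in the example is only an illustrative value).

```python
def ms_to_frame(ms, fps):
    """Convert a millisecond timestamp to the nearest frame number."""
    return round(ms / 1000 * fps)


def captions_to_sub(captions, fps, offset_ms=0):
    """Build {FRAME_FROM}{FRAME_TO}text lines for a .sub file.

    offset_ms is an optional shift (in milliseconds) added to every
    caption, e.g. to account for an intro; it defaults to no shift.
    """
    lines = []
    for caption in captions:
        start_ms = offset_ms + caption["startTime"]
        end_ms = start_ms + caption["duration"]
        first = ms_to_frame(start_ms, fps)
        last = ms_to_frame(end_ms, fps)
        # Doubled braces in an f-string emit literal { and }.
        lines.append(f"{{{first}}}{{{last}}}{caption['content']}")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = [
        {"content": "cut down to three minutes.",
         "startTime": 3000, "duration": 1000},
    ]
    # 25 fps is an assumed value for illustration only.
    print(captions_to_sub(sample, fps=25))
```

Because frame numbers depend on the frame rate, the same JSON yields different .sub files for different encodings of the video, which is one reason the time-based .srt format is simpler to target.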