I tried to search for information regarding this, but did not find any adequate answer.
I conducted a study where I changed the playback rate of videos using the JavaScript playbackRate attribute to .5 and 2, from the natural playback rate.
I would like to know how JavaScript changes the video playback rate.
For example, when the playback rate changes to 2, does it only result in frames being dropped from the videos?
What happens when the playback rate is slowed down to .5? Are frames being added in that case? How are the frames added? How is the audio stretched out or trimmed?
The behavior of the video tag in HTML5 depends on the user's browser as well as their OS and/or the device used to view the video.
Different browsers use different audio/video codecs to decode audio/video data. Here's a list of video formats and codecs in different browsers (and platforms):
https://developer.mozilla.org/en-US/docs/Web/HTML/Supported_media_formats
You will notice that while codecs can differ between browsers (IE vs Firefox vs Chrome vs Safari vs Opera), they can also differ between the desktop and mobile versions of the same browser. Decoders can be determined by the OS (for example, iOS only supports H.264 decoders). Some browsers allow users to add decoders via browser add-ons.
The basic point is that the way playback controls behave (including speeding up or slowing down) depends on the browser's media pipeline and the decoder behind it. When you mention "JavaScript changing the playback rate", JavaScript only sets the requested rate; the browser's underlying media implementation then decides how to decode and present the video at that rate.
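For reference, here is a minimal sketch of what the JavaScript side amounts to. applyPlaybackRate is a hypothetical helper name, not a browser API. playbackRate and preservesPitch are standard HTMLMediaElement properties, but preservesPitch support varies by browser, so the sketch only sets it when it is present. Rejecting zero and negative rates is a choice made for this sketch, not a rule of the API.

```javascript
// Hypothetical helper: sets the playback rate on a <video>/<audio> element
// (or on any object exposing the same properties).
function applyPlaybackRate(media, rate) {
  // Assumption of this sketch: only positive, finite rates are accepted.
  if (!Number.isFinite(rate) || rate <= 0) {
    throw new RangeError("rate must be a positive, finite number");
  }
  // Ask for pitch-corrected audio where the browser exposes the switch.
  if ("preservesPitch" in media) {
    media.preservesPitch = true;
  }
  // This only records the requested rate; how frames and audio are
  // actually handled is up to the browser's media implementation.
  media.playbackRate = rate;
  return media.playbackRate;
}
```

In a page this would be called as applyPlaybackRate(document.querySelector("video"), 0.5). Everything the question asks about (which frames are shown, how the audio is stretched) happens after this assignment, inside the browser.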
It seems a part of your question was "If I speed up the video to twice the original speed, will it drop half of my frames?", or "If I slow down the video to half the original speed, will it duplicate every frame?"
The answer largely depends upon the content of the video and the type of encoding that was used to create the video. While it is reasonable to think of a video as a series of frames (or images), some frames in a video are more important than others. Most compressed videos are encoded as I, P and B frames. I frames, also known as key frames, contain a complete picture, while P and B frames contain only difference information relative to other frames (P frames relative to earlier frames, B frames relative to both earlier and later ones).
For example, consider a video sequence where there is a car moving in a largely static background. The first frame representing this sequence will contain an image of the entire scene but most of the following frames will only contain information about the moving car. The rest of the scene is static - so the information can be represented as difference images.
When speeding up or slowing down a video, codecs try to preserve the most information in the video sequences being displayed. So if you speed up the video, they will prioritize showing key frames and other information-heavy frames as opposed to frames where the scene is mostly static. So, while there is a chance that half of the frames in the video will be dropped when you speed up the video to twice the speed, you cannot exactly predict which frames will be dropped. It is not as if the 2nd, 4th, 6th and 8th frames are all going to be dropped while all the odd-numbered frames remain.
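Since the dropped frames cannot be predicted, one option is to measure them. The sketch below assumes the browser implements getVideoPlaybackQuality() on the video element, which returns counters named totalVideoFrames and droppedVideoFrames. droppedFraction is a hypothetical helper name. What a browser counts as "dropped" is up to its implementation, so treat the result as an indication rather than an exact frame count.

```javascript
// Hypothetical helper: given two VideoPlaybackQuality-like snapshots,
// returns the fraction of frames reported as dropped between them.
function droppedFraction(before, after) {
  const total = after.totalVideoFrames - before.totalVideoFrames;
  const dropped = after.droppedVideoFrames - before.droppedVideoFrames;
  return total > 0 ? dropped / total : 0;
}

// Browser usage (not executed here):
//   const video = document.querySelector("video");
//   const before = video.getVideoPlaybackQuality();
//   video.playbackRate = 2;
//   // ... let the video play for a while ...
//   const after = video.getVideoPlaybackQuality();
//   console.log(droppedFraction(before, after));
```

Taking a snapshot before and after the rate change keeps frames dropped earlier in the session (for example during startup) out of the figure.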
Here is a good article written by a Firefox developer back in Dec 2014 about Firefox's implementation of video playback rate:
http://blog.pearce.org.nz/2014/12/firefox-video-playbacks-skip-to-next.html
There they mention that Firefox wanted to do a better job of audio speed-up than video speed-up as people were more sensitive to glitches in audio than video. In videos, they would try to "skip to the next key frame", that is, drop all the frames between two key frames if video decoding could not be done on time.
I believe their implementation must have evolved within the past couple of years but this article gives you a good idea of the complexity of playback rate manipulation. Also, if you are trying to closely correlate some video moments with their respective audio sounds, changing playback speed could be tricky.
Another point to keep in mind is that video data in the browser usually does not live on a local drive: the video file is fetched over an unreliable network and loaded asynchronously. So codecs have a lot of built-in optimizations to show video data even over poor connections. The basic idea is to show the frames, or parts of frames, with the most relevant information while discarding the others. This again ties into the point that it is hard to predict which frames will be dropped when you speed up the video; even the assumption that whole frames are dropped may not be correct.
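To see which frames were actually shown in a given session, some browsers expose requestVideoFrameCallback() on the video element; its callback receives metadata that includes mediaTime, the media timestamp of the frame presented. The sketch below is only an illustration under that assumption. recordPresentedFrames and mediaTimeGaps are hypothetical helper names, and the callback is best effort, so a missing timestamp does not prove that a frame was never displayed.

```javascript
// Hypothetical helper: collects the media timestamps of up to maxFrames
// presented frames, then hands the list to onDone.
function recordPresentedFrames(video, maxFrames, onDone) {
  const mediaTimes = [];
  if (maxFrames <= 0 || typeof video.requestVideoFrameCallback !== "function") {
    onDone(mediaTimes); // nothing to record, or the API is unavailable
    return;
  }
  const onFrame = (now, metadata) => {
    mediaTimes.push(metadata.mediaTime);
    if (mediaTimes.length < maxFrames) {
      // The callback fires once per request, so re-register each time.
      video.requestVideoFrameCallback(onFrame);
    } else {
      onDone(mediaTimes);
    }
  };
  video.requestVideoFrameCallback(onFrame);
}

// Hypothetical helper: differences between consecutive media timestamps.
// A gap well above one source frame duration suggests skipped frames.
function mediaTimeGaps(mediaTimes) {
  return mediaTimes.slice(1).map((t, i) => t - mediaTimes[i]);
}
```

Running this once at the natural rate and once at 2 on the same clip, and comparing the gaps, gives a rough picture of how evenly (or unevenly) frames were skipped on that particular browser and machine.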
The calculation that half of the frames will be dropped if the video is played at twice the speed assumes that the video has a constant frame rate. Variable frame rate videos are dealt with differently with respect to playback rate manipulation. In that case, it is more a question of the total data rate (how much data should be displayed per second) as opposed to how many frames need to be dropped or added per second.
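For the constant frame rate case, the arithmetic can be sketched as follows. This is a simplified model, not any browser's actual scheduler: it assumes the only limit is the display refresh rate (displayHz, a parameter introduced here for illustration) and ignores decoding speed and network stalls. minimumDropFraction is a hypothetical helper name.

```javascript
// Simplified model: the display can present at most displayHz distinct
// frames per second. Returns the smallest fraction of source frames that
// cannot be shown at the given playback rate.
function minimumDropFraction(sourceFps, playbackRate, displayHz) {
  // Source frames that fall due per second of wall-clock time.
  const demandedFps = sourceFps * playbackRate;
  if (demandedFps <= displayHz) {
    return 0; // every frame can get at least one display refresh
  }
  return 1 - displayHz / demandedFps;
}

// Under the same model: how many display refreshes each source frame
// stays on screen for (values below 1 mean some frames must be skipped).
function refreshesPerFrame(sourceFps, playbackRate, displayHz) {
  return displayHz / (sourceFps * playbackRate);
}
```

Under this model, a 30 fps video at rate 2 on a 60 Hz display needs no drops at all, while a 60 fps video at rate 2 on the same display has to lose at least half of its frames. At rate .5 no frames need to be added: each existing frame is simply held on screen for more refreshes.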