Media Source Extensions (MSE) needs MP4 content to be fragmented for playback in the browser.
A fragmented MP4 file contains the usual Movie Box (moov) with its Track Boxes (trak), which signal which media streams are available. A Movie Extends Box (mvex) is used to signal that the movie is continued in the fragments. One advantage of this layout is that fragments can be stored in different files.
Fragmented MP4 (fMP4) is a streaming media format based on Part 12 of the MPEG-4 standard (the ISO base media file format). Unlike the older MPEG-2 Transport Stream (M2TS) format, which is used in Apple's streaming platform, fMP4 does not have to multiplex the audio and video together.
The data in the MP4 file is divided into two sections, one containing the media data and the other containing metadata. The media data holds the audio or video samples, and the metadata carries information such as random-access flags and timestamps. The structures in MP4 are typically referred to as atoms or boxes.
A fragmented MP4 contains a series of segments which can be requested individually if your server supports byte-range requests.
All MP4 files use an object-oriented format made up of boxes (also known as atoms).
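To make the box structure concrete, here is a minimal sketch of a box walker. It assumes the standard box header layout (a 4-byte big-endian size followed by a 4-byte type, with size 1 meaning a 64-bit size follows and size 0 meaning "to the end of the container"). The helper names iter_boxes, is_fragmented and make_box, and the tiny synthetic sample files, are made up for illustration; real files carry far more content inside each box.

```python
import struct

def iter_boxes(data, start=0, end=None):
    """Yield (type, offset, size, header_size) for each box in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:
            # A 64-bit "largesize" field follows the type field.
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:
            # The box runs to the end of the enclosing container.
            size = end - pos
        if size < header or pos + size > end:
            break  # malformed or truncated box: stop rather than guess
        yield box_type.decode("ascii", "replace"), pos, size, header
        pos += size

def is_fragmented(data):
    """Return True if the moov box contains an mvex child box."""
    for box_type, offset, size, header in iter_boxes(data):
        if box_type == "moov":
            children = iter_boxes(data, offset + header, offset + size)
            return any(child[0] == "mvex" for child in children)
    return False

def make_box(box_type, payload=b""):
    """Build a synthetic box (illustration only, not a valid media file)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

# Hypothetical sample data: a "fragmented" and a "plain" layout.
fragmented = (make_box(b"ftyp", b"isom\x00\x00\x00\x00")
              + make_box(b"moov", make_box(b"mvex"))
              + make_box(b"moof")
              + make_box(b"mdat", b"\x00" * 16))
plain = (make_box(b"ftyp", b"isom\x00\x00\x00\x00")
         + make_box(b"moov", make_box(b"trak"))
         + make_box(b"mdat", b"\x00" * 16))

top_level = [box[0] for box in iter_boxes(fragmented)]
print(top_level, is_fragmented(fragmented), is_fragmented(plain))
```

On a real file you would read the bytes from disk and pass them to iter_boxes; the walker only looks at headers, so it does not need to understand the contents of each box.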
You can view a representation of the boxes in your MP4 using an online tool such as MP4 Parser or, if you're using Windows, MP4 Explorer. Let's compare a normal MP4 with one that is fragmented:
This screenshot (from MP4 Parser) shows an MP4 that hasn't been fragmented and quite simply has one massive mdat (Media Data) box.
If we were building a video player that supports adaptive bitrate, we might need to know the byte position of the 10-second mark in a 0.5 Mbps and a 1 Mbps file in order to switch the video source between the two files at that moment. Determining this exact byte position within one massive mdat in each respective file is not trivial.
This screenshot shows a fragmented MP4 which has been segmented using MP4Box with the onDemand profile.
You'll notice the sidx box and a series of moof+mdat box pairs. The sidx is the Segment Index and stores metadata giving the precise byte ranges of the moof+mdat segments.
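As a rough illustration of how the sidx maps segments to byte ranges, the sketch below parses a sidx payload following the field layout in ISO/IEC 14496-12: version and flags, reference ID, timescale, earliest presentation time, first offset, then one 12-byte entry per reference. The function name parse_sidx, the returned dictionary keys, and the sample values (timescale 1000, a sidx ending at byte 100, two 5-second segments) are assumptions for the example. It also assumes every reference points directly at media, not at a nested sidx.

```python
import struct

def parse_sidx(payload, sidx_end):
    """Return segment byte ranges and times from a sidx payload.

    payload  -- bytes of the sidx box after its 8-byte box header
    sidx_end -- file offset of the first byte after the sidx box
    """
    version = payload[0]
    _ref_id, timescale = struct.unpack_from(">II", payload, 4)
    if version == 0:
        earliest, first_offset = struct.unpack_from(">II", payload, 12)
        pos = 20
    else:
        earliest, first_offset = struct.unpack_from(">QQ", payload, 12)
        pos = 28
    _reserved, count = struct.unpack_from(">HH", payload, pos)
    pos += 4

    segments = []
    start_byte = sidx_end + first_offset  # offsets are relative to the sidx end
    start_ticks = earliest
    for _ in range(count):
        ref, duration, _sap = struct.unpack_from(">III", payload, pos)
        pos += 12
        size = ref & 0x7FFFFFFF  # top bit is reference_type, the rest is the size
        segments.append({
            "first_byte": start_byte,
            "last_byte": start_byte + size - 1,
            "start_seconds": start_ticks / timescale,
            "duration_seconds": duration / timescale,
        })
        start_byte += size
        start_ticks += duration
    return segments

# Hypothetical version-0 sidx: timescale 1000, two 5-second segments.
sample_payload = (
    b"\x00\x00\x00\x00"                      # version 0, flags 0
    + struct.pack(">II", 1, 1000)            # reference_ID, timescale
    + struct.pack(">II", 0, 0)               # earliest time, first_offset
    + struct.pack(">HH", 0, 2)               # reserved, reference_count
    + struct.pack(">III", 1000, 5000, 0x90000000)
    + struct.pack(">III", 1200, 5000, 0x90000000)
)
segments = parse_sidx(sample_payload, sidx_end=100)
for seg in segments:
    print(seg)
```

Each entry gives an inclusive byte range, which is the form an HTTP Range header expects.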
Essentially, you can independently load the sidx (its byte range will be defined in the accompanying .mpd Media Presentation Description file) and then choose which segments you'd like to subsequently load and add to the MSE SourceBuffer.
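The request side of this can be sketched as follows: given an inclusive byte range for a segment, build an HTTP request carrying a Range header. The URL, the byte values and the helper name segment_request are placeholders, and no network call is made here. In a browser the same Range header would be sent with fetch or XMLHttpRequest and the response bytes appended to the SourceBuffer.

```python
from urllib.request import Request

def segment_request(url, first_byte, last_byte):
    """Build a request for one inclusive byte range of a media file."""
    if first_byte < 0 or last_byte < first_byte:
        raise ValueError("invalid byte range")
    return Request(url, headers={"Range": f"bytes={first_byte}-{last_byte}"})

# Placeholder URL and byte range, for illustration only.
req = segment_request("https://example.com/video_1mbps.mp4", 100, 1099)
print(req.get_header("Range"))
```

A server that supports byte-range requests answers such a request with status 206 (Partial Content) and only the requested bytes.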
Importantly, each segment is created at a regular interval of your choosing (e.g. every 5 seconds), so the segments can have temporal alignment across files of different bitrates, making it easy to adapt the bitrate during playback.
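A small worked example shows why temporal alignment helps. With a fixed 5-second segment duration, the 10-second mark falls at the start of the same segment index in every rendition, and only the byte range differs. The segment sizes below are invented for illustration, and the helper names segment_index and byte_range are hypothetical; in practice the sizes would come from each file's sidx.

```python
SEGMENT_SECONDS = 5  # assumed segment duration shared by all renditions

def segment_index(seconds):
    """Index of the segment that contains the given playback time."""
    return int(seconds // SEGMENT_SECONDS)

def byte_range(sizes, index, first_byte=0):
    """Inclusive byte range of segment `index`, given all segment sizes."""
    start = first_byte + sum(sizes[:index])
    return start, start + sizes[index] - 1

# Invented segment sizes in bytes for two renditions of the same video.
low_sizes = [310_000, 315_000, 312_000, 309_000]    # roughly 0.5 Mbps
high_sizes = [620_000, 631_000, 625_000, 618_000]   # roughly 1 Mbps

switch_at = 10  # seconds
index = segment_index(switch_at)
low_range = byte_range(low_sizes, index)
high_range = byte_range(high_sizes, index)
print(index, low_range, high_range)
```

Switching from the low to the high rendition at 10 seconds then means requesting high_range in place of low_range for the same segment index, with no search inside the media data.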