
Build an ESDS Box for an MP4 that Firefox Can Play

I am generating MP4 files (with H.264 video and AAC audio) by transmuxing from MPEG-TS in JavaScript, to be played in the browser via blob URLs. Everything works fine in Chrome, and if I grab the blob URLs out of the developer console and download them, the generated files play fine in Windows Media Player as well. Firefox, however, claims that they are corrupted.

I've narrowed the issue down to a problem with the ESDS box in the audio metadata. If I repackage the source MPEG-TS files by some other means (like ffmpeg), and hand-edit my generated files in a hex editor to paste in the ESDS box from the equivalent file generated by other software, then Firefox is happy.

My code that builds the ESDS box. (And I'm tracking the issue)

I attempted to write it by a pretty straightforward transcribe-stuff-from-the-MPEG-specs process, but that is no guarantee that I did not screw it up. Since Chrome and Windows Media play my files just fine, I'm not sure if it's actually an error in my file that they are somehow capable of ignoring, or if it's a problem with Firefox. I suspect the former, but I'm just not sure.

Anyone got any insight, or perhaps a straightforward, easy-to-understand reference for how to build a proper ESDS box?

EDIT: Here are some different ESDS sections produced for the same input file (as hex bytes, copied out of my hex editor):

Mine:

00 00 00 27 65 73 64 73 00 00 00 00 03 22 00 00
02 04 14 40 15 00 00 00 00 00 3a f1 00 00 2d e6
05 02 12 10 06 01 02

mpegts:

00 00 00 33 65 73 64 73 00 00 00 00 03 80 80 80
22 00 02 00 04 80 80 80 14 40 15 00 00 00 00 00
00 00 00 00 00 00 05 80 80 80 02 12 10 06 80 80
80 01 02

ffmpeg:

00 00 00 2c 65 73 64 73 00 00 00 00 03 80 80 80
1b 00 02 00 04 80 80 80 0d 40 15 00 00 00 00 01
5f 42 00 00 00 00 06 80 80 80 01 02

Oddly, and I did not notice this before, Firefox will play the video with ffmpeg's output, but neither Firefox nor Windows Media will actually play the sound (Chrome does). Firefox and Windows Media are both happy to play the video with sound using the output from mpegts, though. With mine, Chrome and Windows Media will play the video with sound, but Firefox doesn't play it at all and claims the video is corrupted.

Logan R. Kearsley asked Oct 16 '25 17:10


1 Answer

The 0x80 bytes do not belong to the tag before them, but to the length value after them. Version 2 of the ISO spec changed the interpretation of the length value so that it can represent sizes too large for a single byte, by making it a 'VarInt32' type. The high bit of each byte signals that another length byte follows; the lower 7 bits carry the value, most significant group first.

You could use this to encode arbitrarily large values, but the ISO spec limits the length field to 4 bytes at most, i.e. to the range 0 to 2^(4*7)-1 = 268,435,455.

I.e.:

0x80,0x80,0x80,0x0E = 0x80,0x0E = 0x0E => 14
0x80,0x80,0x84,0x7f = 0x84,0x7f => (0x04 << 7) + 0x7f = 0x27f = 639
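
As a rough illustration of the rule above, here is a minimal JavaScript sketch of decoding and encoding such a length field. The function names `decodeDescriptorLength` and `encodeDescriptorLength` and the `minBytes` padding parameter are made up for this example; they do not come from the spec or from any library.

```javascript
// Decode a descriptor length starting at `offset` in `bytes`.
// Each byte contributes its low 7 bits; a set high bit means "more bytes follow".
// Returns the decoded value and how many bytes were consumed.
function decodeDescriptorLength(bytes, offset = 0) {
  let value = 0;
  for (let i = 0; i < 4; i++) {
    if (offset + i >= bytes.length) {
      throw new RangeError("truncated length field");
    }
    const b = bytes[offset + i];
    value = (value << 7) | (b & 0x7f);
    if ((b & 0x80) === 0) {
      return { value, bytesRead: i + 1 };
    }
  }
  throw new RangeError("length field longer than 4 bytes");
}

// Encode `value` as a descriptor length.
// `minBytes` (hypothetical parameter) pads with leading 0x80 bytes, which is
// how the 0x80,0x80,0x80,xx form seen in the dumps comes about.
function encodeDescriptorLength(value, minBytes = 1) {
  if (!Number.isInteger(value) || value < 0 || value > 0x0fffffff) {
    throw new RangeError("length must be an integer in 0..2^28-1");
  }
  if (minBytes < 1 || minBytes > 4) {
    throw new RangeError("minBytes must be between 1 and 4");
  }
  // Split into 7-bit groups, most significant group first.
  const groups = [];
  let rest = value;
  do {
    groups.unshift(rest & 0x7f);
    rest >>>= 7;
  } while (rest > 0);
  while (groups.length < minBytes) {
    groups.unshift(0);
  }
  // Every byte except the last gets the continuation bit.
  return groups.map((g, i) => (i < groups.length - 1 ? g | 0x80 : g));
}
```

With these, `decodeDescriptorLength([0x80, 0x80, 0x84, 0x7f]).value` gives 639, and `encodeDescriptorLength(0x22, 4)` gives the bytes 0x80, 0x80, 0x80, 0x22 that follow the 0x03 tag in the mpegts dump.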

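A similar encoding is used by Google's protobuf, where it is called a Base 128 Varint. Note that protobuf stores the least significant 7-bit group first, whereas the descriptor length here stores the most significant group first.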

StefanB answered Oct 18 '25 05:10