TL;DR: I want to read raw h264 streams from AVI/MP4 files, even broken/incomplete.
Almost every document about h264 tells me that it consists of NAL packets. Okay. Almost everywhere told to me that the packet should start with a signature like 00 00 01
or 00 00 00 01
. For example, https://stackoverflow.com/a/18638298/8167678, https://stackoverflow.com/a/17625537/8167678
The format of H.264 is that it’s made up of NAL Units, each starting with a start prefix of three bytes with the values 0x00, 0x00, 0x01 and each unit has a different type depending on the value of the 4th byte right after these 3 starting bytes. One NAL Unit IS NOT one frame in the video, each frame is made up of a number of NAL Units.
Okay.
I downloaded random_youtube_video.mp4 and strip out one frame from it:
ffmpeg -ss 10 -i random_youtube_video.mp4 -frames 1 -c copy pic.avi
And got:
Red part - this is part of AVI container, other - actual data.
As you can see, here I have 00 00 24 A9
instead of 00 00 00 01
This AVI file plays perfectly
I do same for mp4 container:
As you can see, here exact same bytes. This MP4 file plays perfectly
I try to strip out raw data:
ffmpeg -i pic.avi -c copy pic.h264
This file can't play in VLC or even ffmpeg, which produced this file, can't parse it:
I downloaded mp4 stream analyzer and got:
MP4Box
tells me:
Cannot find H264 start code
Error importing pic.h264: BitStream Not Compliant
It very hard to learn internals of h264, when nothing works.
So, I have questions:
UPDATE:
It seems bug in ffmpeg:
When I do double conversion:
ffmpeg -ss 10 -i random_youtube_video.mp4 -frames 1 -c copy pic.mp4
ffmpeg pic.mp4 -c copy pic.h264
But when I convert file directly:
ffmpeg -ss 10 -i random_youtube_video.mp4 -frames 1 -c copy pic.h264
I have NALs signatures and one extra NAL unit. Other bytes are same (selected).
This is bug?
UPDATE
Not, this is not bug, U must use option -bsf h264_mp4toannexb to save stream as "Annex B" format (with prefixes)
"I want to read raw h264 streams from AVI files, even broken/incomplete."
"Almost everywhere told to me that the packet should start with a signature like :
00 00 01
or00 00 00 01
""...As you can see, here I have
00 00 24 A9
instead of00 00 00 01
"
Your H264 is in AVCC format which means it uses data sizes (instead of data start codes). It is only Annex-B that will have your mentioned signature as start code.
You seek frames, not by looking for start codes, but instead you just do skipping by frame sizes to reach the final correct offset of a (requested) frame...
AVI processing :
Read size (four) bytes (32-bit integer, Little Endian).
Extract the next following bytes up to size amount.
This is your H.264 frame (in AVCC format), decode the bytes to view image.
To convert into Annex-B, try replacing first 4 bytes of H.264 frame bytes with 00 00 00 01
.
Consider your shown AVI bytes (see first picture) :
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 4C 49 53 54 BA 24 00 00 6D 6F 76 69 ....LISTº$..movi
30 30 64 63 AD 24 00 00 00 00 24 A9 65 88 84 27 00dc.$....$©eˆ„'
C7 11 FE B3 C7 83 08 00 08 2A 7B 6E 59 B5 71 E1 Ç.þ³Çƒ...*{nYµqá
E3 9C 0E 73 E7 10 50 00 18 E9 25 F7 AA 7D 9C 30 ãœ.sç.P..é%÷ª}œ0
E6 2F 0F 20 00 3A 64 AA CA 5E 4F CA FF AE 20 04 æ/. .:dªÊ^OÊÿ® .
07 81 40 00 48 00 0A 28 71 21 84 48 06 18 90 0C [email protected]..(q!„H....
31 14 57 9E 7A CD 63 A0 E0 9B 96 69 C5 18 AE F2 1.WžzÍc à›–iÅ.®ò
E6 07 02 29 01 20 10 70 A1 0F 8C BC 73 F0 78 FA æ..). .p¡.Œ¼sðxú
9E 1D E1 C2 BF 8C 62 CE CE AC 14 5A A4 E1 45 44 ž.á¿ŒbÎά.Z¤áED
38 38 85 DB 12 57 3E F6 E0 FB AE 03 04 21 62 8D 88…Û.W>öàû®..!b.
F6 F1 1E 37 1C A2 FF 75 1C F1 02 66 0C 92 07 06 öñ.7.¢ÿu.ñ.f.’..
15 7C 90 15 6F 7D FC BD 13 1E 2B 0C 14 3C 0C 00 .|..o}ü½..+..<..
B0 EA 6F 53 B4 98 D7 80 7A 68 3E 34 69 20 D2 FA °êoS´˜×€zh>4i Òú
F0 91 FC 75 C6 00 01 18 C0 00 3B 9A C5 E2 7D BF ð‘üuÆ...À.;šÅâ}¿
Some explanation :
Ignore leading multiple 00
bytes.
4C 49 53 54 D6 3C 00 00 6D 6F 76 69
including 30 30 64 63
= AVI "List" header.
AD 24 00 00
== decimal 9389
is AVI's own size of H264 item (must read in Little Endian).
Notice that the AVI bytes include...
- a note of item's total size (AD 24 00 00
... or reverse for Little Endian : 00 00 24 AD
)
- followed by item data (00 00 24 A9 65 88 84 27 ... etc ... C5 E2 7D BF
).
This size includes both the 4 bytes of the AVI's"size" entry + expected bytes length of the item's own bytes. Can be written simply as:
AVI_Item_Size = ( 4 + item_H264_Frame.length );
H.264 video frame bytes in AVI :
Next follows the item data, which is the H.264 video frame. By sheer coincidence of formats/bytes layout, it too holds a 4-byte entry for data's size (since your H264 is in AVCC format, if it was Annex-B then you would be seeing start code bytes here instead of size bytes).
Unlike AVI bytes, these H264 size bytes are written in Big Endian format.
00 00 24 A9
= size of bytes for this video frame (instead of start code : 00 00 00 01
).
65 88 84 27 C7 11 FE B3 C7
= H.264 keyframe (always begins X5, where the X
value is based on other settings).
Remember after four size bytes (or even start codes) if followed by...
X5
= keyframe (IDR), example byte 65
.X1
= P or B frame, example byte 41
.X6
= SEI (Supplemental Enhancement Information).X7
= SPS (Sequence Parameter Set).X8
= PPS (Picture Parameter Set).00 00 00 X9
= Access unit delimiter.You can find the H.264 if you search for exact same bytes within AVI file. See third picture, these are your H.264 bytes (they are cut & pasted into the AVI container).
Sometimes a frame is sliced into different NAL units. So if you extract a key frame and it only shows 1/2 or 1/3 instead of full image, just grab next one or two NAL and re-try the decode.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With