Is the leading structure in an MP3 file a real frame?

I'm working on decoding MP3 files, with only basic knowledge of the format. I recently implemented a simple MP3 decoder, and ran into this problem when comparing its output with that of the Maaate decoder.

My decoder extracts one more frame than the Maaate decoder. After scrutinizing the result for a sample MP3 file, I found that the first frame is abnormal: it is 413 bytes long with frame header 0xfffb9064, whereas all the other frames are 100 bytes long with header 0xfffb1064.

My question is: is the first "frame" in the result a real frame? If so, why does it look different from the others? If not, what is this structure used for, and how can I distinguish it from real frames, given that both share the frame sync code 0xfff?

Summer_More_More_Tea asked Apr 19 '11 01:04



2 Answers

MP3 streams don't have a file header. It sounds a bit odd that you have just one frame at the beginning which is longer than the rest, but this is perfectly legal.

There's a quick description of the bits in the header at: http://www.datavoyage.com/mpgscript/mpeghdr.htm

In your case, both headers have the following in common:

  • MPEG-1
  • Layer 3
  • Not protected
  • 44.1kHz
  • No padding
  • Not private
  • M/S joint stereo
  • No copyright
  • Original media
  • No emphasis

The first frame differs from the rest with:

  • 128 kbit/s (giving a 417-byte frame, or 413 bytes after the 4-byte header)

The rest are:

  • 32 kbit/s (giving 104-byte frames, or 100 bytes after the 4-byte header)
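The field-by-field reading above can be reproduced with a few shifts and masks. This is a minimal sketch that assumes the bit layout described on the linked page and handles only MPEG-1 Layer III; the function name `decode_header` and the dictionary keys are made up for illustration.

```python
# Lookup tables for MPEG-1 Layer III only (the case discussed here).
# Index 0 means "free format" and index 15 is invalid, so both map to None.
BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96, 112,
                 128, 160, 192, 224, 256, 320, None]
SAMPLE_RATES_HZ = [44100, 48000, 32000, None]
CHANNEL_MODES = ["stereo", "joint stereo", "dual channel", "mono"]


def decode_header(header: int) -> dict:
    """Split a 32-bit MPEG-1 Layer III frame header into its fields."""
    if (header >> 21) & 0x7FF != 0x7FF:
        raise ValueError("frame sync bits not set")
    version_id = (header >> 19) & 0x3   # 0b11 = MPEG-1
    layer = (header >> 17) & 0x3        # 0b01 = Layer III
    if version_id != 0b11 or layer != 0b01:
        raise ValueError("this sketch only handles MPEG-1 Layer III")
    return {
        # Protection bit 0 means a CRC follows the header.
        "protected": ((header >> 16) & 1) == 0,
        "bitrate_kbps": BITRATES_KBPS[(header >> 12) & 0xF],
        "sample_rate_hz": SAMPLE_RATES_HZ[(header >> 10) & 0x3],
        "padding": (header >> 9) & 1,
        "private": (header >> 8) & 1,
        "channel_mode": CHANNEL_MODES[(header >> 6) & 0x3],
        # The two mode extension bits are only meaningful for joint stereo.
        "ms_stereo": bool((header >> 5) & 1),
        "intensity_stereo": bool((header >> 4) & 1),
        "copyright": (header >> 3) & 1,
        "original": (header >> 2) & 1,
        "emphasis": header & 0x3,
    }


first = decode_header(0xFFFB9064)
rest = decode_header(0xFFFB1064)
# Only the bitrate field differs between the two headers.
print(sorted(k for k in first if first[k] != rest[k]))  # ['bitrate_kbps']
print(first["bitrate_kbps"], rest["bitrate_kbps"])      # 128 32
```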

That page gives a formula for the frame size in bytes, computed from the header: 144*bitrate/samplerate+padding, with the bitrate in bits per second and the division truncated to an integer.
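As a quick check of that formula against the numbers in the question, here is a small sketch. It assumes the bitrate is given in bits per second and that the division truncates; the function name `frame_length_bytes` is hypothetical.

```python
def frame_length_bytes(bitrate_bps: int, sample_rate_hz: int, padding: int = 0) -> int:
    """Frame length (header included) for MPEG-1 Layer III."""
    # Integer division: the fractional part is dropped, and encoders make up
    # for it by setting the padding bit on some frames.
    return 144 * bitrate_bps // sample_rate_hz + padding


HEADER_BYTES = 4

first = frame_length_bytes(128_000, 44_100)  # 128 kbit/s frame
rest = frame_length_bytes(32_000, 44_100)    # 32 kbit/s frames

print(first, first - HEADER_BYTES)  # 417 413
print(rest, rest - HEADER_BYTES)    # 104 100
```

The 413 and 100 byte lengths reported in the question therefore look like frame lengths with the 4-byte header excluded.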

I suspect the 128 kbit/s first frame is an artifact (bug) of the encoder used to generate the sample. After the first frame, it's still a constant-bitrate file at 32 kbit/s. Given that an MP3 decoder can't produce output until it has a few frames, and that the bitrate won't suddenly jump halfway through, this is unlikely to upset anything.

John Ripley answered Sep 20 '22 11:09


The very first frame can be used as what is often called the "LAME Tag" (the generator's name does not have to be LAME, though).

ffmpeg has been able (and may still be able) to create this tag before the encoder knows what the upcoming data will look like, in which case it simply uses defaults such as 128 kbps rather than the bitrate of your actual MP3 data.

So you can't decide whether you have CBR or VBR data based on that frame's bitrate.

To see whether you have such a tag, print out at least the first 64 bytes (or use a hex editor); you should see the letters "Info" (CBR) or "Xing" (VBR) fairly close to the start (often around byte 0x24). Both eyeD3 and ffprobe can decode this tag.
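The check described above can be sketched as a simple scan of the start of the first frame. This assumes the bytes passed in begin at the first frame header (any ID3v2 tag at the start of the file has to be skipped first), and the function name `find_xing_or_info` is hypothetical. A plain substring search like this could in principle match audio data by accident, so treat a hit as a hint rather than proof.

```python
from typing import Optional, Tuple


def find_xing_or_info(frame: bytes, window: int = 64) -> Optional[Tuple[str, int]]:
    """Return (marker, offset) if "Xing" or "Info" appears in the first
    `window` bytes of the frame, else None."""
    head = frame[:window]
    hits = []
    for marker in (b"Xing", b"Info"):
        offset = head.find(marker)
        if offset >= 0:
            hits.append((offset, marker.decode("ascii")))
    if not hits:
        return None
    offset, name = min(hits)  # keep the earliest match
    return name, offset


# Synthetic example: a 4-byte header, 32 bytes of side information
# (the MPEG-1 stereo case), then the marker at offset 36 = 0x24.
fake_frame = bytes.fromhex("fffb9064") + bytes(32) + b"Info" + bytes(60)
print(find_xing_or_info(fake_frame))  # ('Info', 36)
```

If the marker is found, a decoder comparing frame counts with another tool may want to skip this frame, since it carries metadata rather than audio.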

I have a page about the format here.

Alexis Wilke answered Sep 23 '22 11:09