Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I detect whether a WAV file has a 44 or 46-byte header?

Tags:

audio

wav

I've discovered it is dangerous to assume that all PCM wav audio files have 44 bytes of header data before the samples begin. Though this is common, many applications (ffmpeg for example), will generate wavs with a 46-byte header and ignoring this fact while processing will result in a corrupt and unreadable file. But how can you detect how long the header actually is?

Obviously there is a way to do this, but I searched and found little discussion about this. A LOT of audio projects out there assume 44 (or conversely, 46) depending on the authors own context.

like image 559
Matt J. Avatar asked Nov 15 '13 00:11

Matt J.


2 Answers

You should be checking all of the header data to see what the actual sizes are. Broadcast Wave Format files will contain an even larger extension subchunk. WAV and AIFF files from Pro Tools have even more extension chunks that are undocumented as well as data after the audio. If you want to be sure where the sample data begins and ends you need to actually look for the data chunk ('data' for WAV files and 'SSND' for AIFF).

As a review, all WAV subchunks conform to the following format:

Subchunk Descriptor (4 bytes)
    Subchunk Size (4 byte integer, little endian)
    Subchunk Data (size is Subchunk Size)

This is very easy to process. All you need to do is read the descriptor, if it's not the one you are looking for, read the data size and skip ahead to the next. A simple Java routine to do that would look like this:

//
// Quick note for people who don't know Java well:
// 'in.read(...)' returns -1 when the stream reaches
// the end of the file, so 'if (in.read(...) < 0)'
// is checking for the end of file.
//
public static void printWaveDescriptors(File file)
        throws IOException {
    try (FileInputStream in = new FileInputStream(file)) {
        byte[] bytes = new byte[4];

        // Read first 4 bytes.
        // (Should be RIFF descriptor.)
        if (in.read(bytes) < 0) {
            return;
        }

        printDescriptor(bytes);

        // First subchunk will always be at byte 12.
        // (There is no other dependable constant.)
        in.skip(8);

        for (;;) {
            // Read each chunk descriptor.
            if (in.read(bytes) < 0) {
                break;
            }

            printDescriptor(bytes);

            // Read chunk length.
            if (in.read(bytes) < 0) {
                break;
            }

            // Skip the length of this chunk.
            // Next bytes should be another descriptor or EOF.
            int length = (
                  Byte.toUnsignedInt(bytes[0])
                | Byte.toUnsignedInt(bytes[1]) << 8
                | Byte.toUnsignedInt(bytes[2]) << 16
                | Byte.toUnsignedInt(bytes[3]) << 24
            );
            in.skip(Integer.toUnsignedLong(length));
        }

        System.out.println("End of file.");
    }
}

private static void printDescriptor(byte[] bytes)
        throws IOException {
    String desc = new String(bytes, "US-ASCII");
    System.out.println("Found '" + desc + "' descriptor.");
}

For example here is a random WAV file I had:

Found 'RIFF' descriptor.
Found 'bext' descriptor.
Found 'fmt ' descriptor.
Found 'minf' descriptor.
Found 'elm1' descriptor.
Found 'data' descriptor.
Found 'regn' descriptor.
Found 'ovwf' descriptor.
Found 'umid' descriptor.
End of file.

Notably, here both 'fmt ' and 'data' legitimately appear in between other chunks because Microsoft's RIFF specification says that subchunks can appear in any order. Even some major audio systems that I know of get this wrong and don't account for that.

So if you want to find a certain chunk, loop through the file checking each descriptor until you find the one you're looking for.

like image 136
Radiodef Avatar answered Oct 24 '22 17:10

Radiodef


The trick is to look at the "Subchunk1Size", which is a 4-byte integer beginning at byte 16 of the header. In a normal 44-byte wav, this integer will be 16 [10, 0, 0, 0]. If it's a 46-byte header, this integer will be 18 [12, 0, 0, 0] or maybe even higher if there is extra extensible meta data (rare?).

The extra data itself (if present), begins in byte 36.

So a simple C# program to detect the header length would look like this:

static void Main(string[] args)
{
    byte[] bytes = new byte[4];
    FileStream fileStream = new FileStream(args[0], FileMode.Open, FileAccess.Read);
    fileStream.Seek(16, 0);
    fileStream.Read(bytes, 0, 4);
    fileStream.Close();
    int Subchunk1Size = BitConverter.ToInt32(bytes, 0);

    if (Subchunk1Size < 16)
        Console.WriteLine("This is not a valid wav file");
    else
        switch (Subchunk1Size)
        {
            case 16:
                Console.WriteLine("44-byte header");
                break;
            case 18:
                Console.WriteLine("46-byte header");
                break;
            default:
                Console.WriteLine("Header contains extra data and is larger than 46 bytes");
                break;
        }
}
like image 34
Matt J. Avatar answered Oct 24 '22 15:10

Matt J.