
iPhone combine audio files

Tags: iphone

I have been searching for the answer and have not been able to find it -- obviously ;)

Here is what I have -- I have a list of words. Each word is saved as a wav file on the iPhone. In my app, the user will pick words and I would like to put the words together to make a sentence.

I have not been able to determine how to combine multiple wav files together, in sequence, to create the entire sentence as a single file.

Through examples I have figured out how to play the files together as one -- but the example mixes them, and I need to append them one after another. I have tried appending the raw data and stripping the header information from all but the first file, but that does not work: the resulting file is the correct length, yet it only plays the content of the first file.

I think the correct path is to use AudioFileReadPackets to read packets from each file and AudioFileWritePackets to write them into a new file. This has proven difficult...

Does anyone have experience with the audio APIs and could provide some sample code?
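
Something like the following packet-copy loop is what I have in mind -- an untested sketch, assuming every source .wav shares the same linear PCM format (constant bit rate, so no packet descriptions are needed) and with all error handling omitted:

static void AppendWavFiles(CFURLRef sourceURLs[], UInt32 fileCount, CFURLRef destinationURL)
{
    // read the data format from the first source file
    AudioFileID firstFile;
    AudioFileOpenURL(sourceURLs[0], 0x01/*fsRdPerm*/, 0, &firstFile);

    AudioStreamBasicDescription asbd;
    UInt32 size = sizeof(asbd);
    AudioFileGetProperty(firstFile, kAudioFilePropertyDataFormat, &size, &asbd);
    AudioFileClose(firstFile);

    // create the destination with the same format
    AudioFileID destFile;
    AudioFileCreateWithURL(destinationURL, kAudioFileWAVEType, &asbd, kAudioFileFlags_EraseFile, &destFile);

    SInt64 destPacket = 0;      // running write position in the destination
    char buffer[0x10000];       // 64K scratch buffer

    for (UInt32 i = 0; i < fileCount; ++i) {
        AudioFileID sourceFile;
        AudioFileOpenURL(sourceURLs[i], 0x01/*fsRdPerm*/, 0, &sourceFile);

        SInt64 sourcePacket = 0;
        for (;;) {
            UInt32 ioBytes = sizeof(buffer);
            UInt32 ioPackets = ioBytes / asbd.mBytesPerPacket;
            // PCM is constant bit rate, so the packet description array is NULL
            AudioFileReadPackets(sourceFile, false, &ioBytes, NULL, sourcePacket, &ioPackets, buffer);
            if (ioPackets == 0) break;      // end of this source file
            AudioFileWritePackets(destFile, false, ioBytes, NULL, destPacket, &ioPackets, buffer);
            sourcePacket += ioPackets;
            destPacket   += ioPackets;
        }
        AudioFileClose(sourceFile);
    }
    AudioFileClose(destFile);
}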

OK -- more research into the matter... it looks like the correct approach is Audio Queue Offline Render. There is sample code provided by Apple (AQOfflineRenderTest). The reason for the offline render is that you can attach an output buffer to the render and save the result to a file. More to come as I progress with the project...

OK -- three days and no real progress...

I am trying to combine the three .wav files into the destination file. Right now, when you run this code, only the first file is saved into the destination.

Any ideas?

This source code uses the iPublicUtility classes provided by Apple -- they can be downloaded with several of Apple's sample projects; one of them is aurioTouch.

Here is my code (place it in a .cpp file and call CombineAudioFiles from a normal Objective-C source file):

// standard includes
#include <AudioToolbox/AudioQueue.h>
#include <AudioToolbox/AudioFile.h>
#include <AudioToolbox/ExtendedAudioFile.h>

// helpers
#include "CAXException.h"
#include "CAStreamBasicDescription.h"

#define kNumberOfBuffers 3
#define kMaxNumberOfFiles 3

// the application specific info we keep track of
struct AQTestInfo
{
    AudioFileID                     mAudioFile[kMaxNumberOfFiles];
    CAStreamBasicDescription        mDataFormat[kMaxNumberOfFiles];
    AudioQueueRef                   mQueue[kMaxNumberOfFiles];
    AudioQueueBufferRef             mBuffer[kNumberOfBuffers];
    UInt32                          mNumberOfAudioFiles;
    UInt32                          mCurrentAudioFile;
    UInt32                          mBufferByteSize;
    SInt64                          mCurrentPacket;
    UInt32                          mNumPacketsToRead;
    AudioStreamPacketDescription    *mPacketDescs;
    bool                            mFlushed;
    bool                            mDone;
};


#pragma mark- Helper Functions
// ***********************
// CalculateBytesForTime Utility Function

// we only use time here as a guideline
// we are really trying to get somewhere between 16K and 64K buffers, but not allocate too much if we don't need it
void CalculateBytesForTime (CAStreamBasicDescription &inDesc, UInt32 inMaxPacketSize, Float64 inSeconds, UInt32 *outBufferSize, UInt32 *outNumPackets)
{
    static const int maxBufferSize = 0x10000;   // limit size to 64K
    static const int minBufferSize = 0x4000;    // limit size to 16K

    if (inDesc.mFramesPerPacket) {
        Float64 numPacketsForTime = inDesc.mSampleRate / inDesc.mFramesPerPacket * inSeconds;
        *outBufferSize = numPacketsForTime * inMaxPacketSize;
    } else {
        // if frames per packet is zero, then the codec has no predictable packet == time
        // so we can't tailor this (we don't know how many packets represent a time period)
        // we'll just return a default buffer size
        *outBufferSize = maxBufferSize > inMaxPacketSize ? maxBufferSize : inMaxPacketSize;
    }

    // we're going to limit our size to our default
    if (*outBufferSize > maxBufferSize && *outBufferSize > inMaxPacketSize) {
        *outBufferSize = maxBufferSize;
    } else {
        // also make sure we're not too small - we don't want to go the disk for too small chunks
        if (*outBufferSize < minBufferSize) {
            *outBufferSize = minBufferSize;
        }
    }

    *outNumPackets = *outBufferSize / inMaxPacketSize;
}

#pragma mark- AQOutputCallback
// ***********************
// AudioQueueOutputCallback function used to push data into the audio queue

static void AQTestBufferCallback(void *inUserData, AudioQueueRef inAQ, AudioQueueBufferRef inCompleteAQBuffer) 
{
    AQTestInfo * myInfo = (AQTestInfo *)inUserData;
    if (myInfo->mDone) return;

    UInt32 numBytes;
    UInt32 nPackets = myInfo->mNumPacketsToRead;
    OSStatus result = AudioFileReadPackets(myInfo->mAudioFile[myInfo->mCurrentAudioFile],      // The audio file from which packets of audio data are to be read.
                                           false,                   // Set to true to cache the data. Otherwise, set to false.
                                           &numBytes,               // On output, a pointer to the number of bytes actually returned.
                                           myInfo->mPacketDescs,    // A pointer to an array of packet descriptions that have been allocated.
                                           myInfo->mCurrentPacket,  // The packet index of the first packet you want to be returned.
                                           &nPackets,               // On input, a pointer to the number of packets to read. On output, the number of packets actually read.
                                           inCompleteAQBuffer->mAudioData); // A pointer to user-allocated memory.
    if (result) {
        DebugMessageN1 ("Error reading from file: %d\n", (int)result);
        exit(1);
    }

    // we have some data
    if (nPackets > 0) {
        inCompleteAQBuffer->mAudioDataByteSize = numBytes;

        result = AudioQueueEnqueueBuffer(inAQ,                                  // The audio queue that owns the audio queue buffer.
                                         inCompleteAQBuffer,                    // The audio queue buffer to add to the buffer queue.
                                         (myInfo->mPacketDescs ? nPackets : 0), // The number of packets of audio data in the inBuffer parameter. See Docs.
                                         myInfo->mPacketDescs);                 // An array of packet descriptions. Or NULL. See Docs.
        if (result) {
            DebugMessageN1 ("Error enqueuing buffer: %d\n", (int)result);
            exit(1);
        }

        myInfo->mCurrentPacket += nPackets;

    } else {
        // **** This ensures that we flush the queue when done -- ensures you get all the data out ****

        if (!myInfo->mFlushed) {
            result = AudioQueueFlush(myInfo->mQueue[myInfo->mCurrentAudioFile]);

            if (result) {
                DebugMessageN1("AudioQueueFlush failed: %d", (int)result);
                exit(1);
            }

            myInfo->mFlushed = true;
        }

        result = AudioQueueStop(myInfo->mQueue[myInfo->mCurrentAudioFile], false);
        if (result) {
            DebugMessageN1("AudioQueueStop(false) failed: %d", (int)result);
            exit(1);
        }

        // reading nPackets == 0 is our EOF condition
        myInfo->mDone = true;
    }
}


// ***********************
#pragma mark- Main Render Function

#if __cplusplus
extern "C" {
#endif

void CombineAudioFiles(CFURLRef sourceURL1, CFURLRef sourceURL2, CFURLRef sourceURL3, CFURLRef destinationURL) 
{
    // main audio queue code
    try {
        AQTestInfo myInfo;

        myInfo.mDone = false;
        myInfo.mFlushed = false;
        myInfo.mCurrentPacket = 0;
        myInfo.mCurrentAudioFile = 0;
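        // NOTE: mCurrentAudioFile is initialized to 0 and never advanced
        // below, so only mAudioFile[0] is ever read -- which is why only
        // the first file ends up in the destination.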

        // get the source file
        XThrowIfError(AudioFileOpenURL(sourceURL1, 0x01/*fsRdPerm*/, 0/*inFileTypeHint*/, &myInfo.mAudioFile[0]), "AudioFileOpen failed");
        XThrowIfError(AudioFileOpenURL(sourceURL2, 0x01/*fsRdPerm*/, 0/*inFileTypeHint*/, &myInfo.mAudioFile[1]), "AudioFileOpen failed");
        XThrowIfError(AudioFileOpenURL(sourceURL3, 0x01/*fsRdPerm*/, 0/*inFileTypeHint*/, &myInfo.mAudioFile[2]), "AudioFileOpen failed");

        UInt32 size = sizeof(myInfo.mDataFormat[myInfo.mCurrentAudioFile]);
        XThrowIfError(AudioFileGetProperty(myInfo.mAudioFile[myInfo.mCurrentAudioFile], kAudioFilePropertyDataFormat, &size, &myInfo.mDataFormat[myInfo.mCurrentAudioFile]), "couldn't get file's data format");

        printf ("File format: "); myInfo.mDataFormat[myInfo.mCurrentAudioFile].Print();

        // create a new audio queue output
        XThrowIfError(AudioQueueNewOutput(&myInfo.mDataFormat[myInfo.mCurrentAudioFile],   // The data format of the audio to play. For linear PCM, only interleaved formats are supported.
                                          AQTestBufferCallback,     // A callback function to use with the playback audio queue.
                                          &myInfo,                  // A custom data structure for use with the callback function.
                                          CFRunLoopGetCurrent(),    // The event loop on which the callback function pointed to by the inCallbackProc parameter is to be called.
                                                                    // If you specify NULL, the callback is invoked on one of the audio queue’s internal threads.
                                          kCFRunLoopCommonModes,    // The run loop mode in which to invoke the callback function specified in the inCallbackProc parameter. 
                                          0,                        // Reserved for future use. Must be 0.
                                          &myInfo.mQueue[myInfo.mCurrentAudioFile]),       // On output, the newly created playback audio queue object.
                                          "AudioQueueNew failed");

        UInt32 bufferByteSize;

        // we need to calculate how many packets we read at a time and how big a buffer we need
        // we base this on the size of the packets in the file and an approximate duration for each buffer
        {
            bool isFormatVBR = (myInfo.mDataFormat[myInfo.mCurrentAudioFile].mBytesPerPacket == 0 || myInfo.mDataFormat[myInfo.mCurrentAudioFile].mFramesPerPacket == 0);

            // first check to see what the max size of a packet is - if it is bigger
            // than our allocation default size, that needs to become larger
            UInt32 maxPacketSize;
            size = sizeof(maxPacketSize);
            XThrowIfError(AudioFileGetProperty(myInfo.mAudioFile[myInfo.mCurrentAudioFile], kAudioFilePropertyPacketSizeUpperBound, &size, &maxPacketSize), "couldn't get file's max packet size");

            // adjust buffer size to represent about a second of audio based on this format
            CalculateBytesForTime(myInfo.mDataFormat[myInfo.mCurrentAudioFile], maxPacketSize, 1.0/*seconds*/, &bufferByteSize, &myInfo.mNumPacketsToRead);

            if (isFormatVBR) {
                myInfo.mPacketDescs = new AudioStreamPacketDescription [myInfo.mNumPacketsToRead];
            } else {
                myInfo.mPacketDescs = NULL; // we don't provide packet descriptions for constant bit rate formats (like linear PCM)
            }

            printf ("Buffer Byte Size: %d, Num Packets to Read: %d\n", (int)bufferByteSize, (int)myInfo.mNumPacketsToRead);
        }

        // if the file has a magic cookie, we should get it and set it on the AQ
        size = sizeof(UInt32);
        OSStatus result = AudioFileGetPropertyInfo (myInfo.mAudioFile[myInfo.mCurrentAudioFile], kAudioFilePropertyMagicCookieData, &size, NULL);

        if (!result && size) {
            char* cookie = new char [size];     
            XThrowIfError (AudioFileGetProperty (myInfo.mAudioFile[myInfo.mCurrentAudioFile], kAudioFilePropertyMagicCookieData, &size, cookie), "get cookie from file");
            XThrowIfError (AudioQueueSetProperty(myInfo.mQueue[myInfo.mCurrentAudioFile], kAudioQueueProperty_MagicCookie, cookie, size), "set cookie on queue");
            delete [] cookie;
        }

        // channel layout?
        OSStatus err = AudioFileGetPropertyInfo(myInfo.mAudioFile[myInfo.mCurrentAudioFile], kAudioFilePropertyChannelLayout, &size, NULL);
        AudioChannelLayout *acl = NULL;
        if (err == noErr && size > 0) {
            acl = (AudioChannelLayout *)malloc(size);
            XThrowIfError(AudioFileGetProperty(myInfo.mAudioFile[myInfo.mCurrentAudioFile], kAudioFilePropertyChannelLayout, &size, acl), "get audio file's channel layout");
            XThrowIfError(AudioQueueSetProperty(myInfo.mQueue[myInfo.mCurrentAudioFile], kAudioQueueProperty_ChannelLayout, acl, size), "set channel layout on queue");
        }

        //allocate the input read buffer
        XThrowIfError(AudioQueueAllocateBuffer(myInfo.mQueue[myInfo.mCurrentAudioFile], bufferByteSize, &myInfo.mBuffer[myInfo.mCurrentAudioFile]), "AudioQueueAllocateBuffer");

        // prepare a canonical interleaved capture format
        CAStreamBasicDescription captureFormat;
        captureFormat.mSampleRate = myInfo.mDataFormat[myInfo.mCurrentAudioFile].mSampleRate;
        captureFormat.SetAUCanonical(myInfo.mDataFormat[myInfo.mCurrentAudioFile].mChannelsPerFrame, true); // interleaved
        XThrowIfError(AudioQueueSetOfflineRenderFormat(myInfo.mQueue[myInfo.mCurrentAudioFile], &captureFormat, acl), "set offline render format");         

        ExtAudioFileRef captureFile;

        // prepare a 16-bit int file format, sample channel count and sample rate
        CAStreamBasicDescription dstFormat;
        dstFormat.mSampleRate = myInfo.mDataFormat[myInfo.mCurrentAudioFile].mSampleRate;
        dstFormat.mChannelsPerFrame = myInfo.mDataFormat[myInfo.mCurrentAudioFile].mChannelsPerFrame;
        dstFormat.mFormatID = kAudioFormatLinearPCM;
        dstFormat.mFormatFlags = kLinearPCMFormatFlagIsPacked | kLinearPCMFormatFlagIsSignedInteger; // little-endian
        dstFormat.mBitsPerChannel = 16;
        dstFormat.mBytesPerPacket = dstFormat.mBytesPerFrame = 2 * dstFormat.mChannelsPerFrame;
        dstFormat.mFramesPerPacket = 1;

        // create the capture file
        XThrowIfError(ExtAudioFileCreateWithURL(destinationURL, kAudioFileCAFType, &dstFormat, acl, kAudioFileFlags_EraseFile, &captureFile), "ExtAudioFileCreateWithURL");

        // set the capture file's client format to be the canonical format from the queue
        XThrowIfError(ExtAudioFileSetProperty(captureFile, kExtAudioFileProperty_ClientDataFormat, sizeof(AudioStreamBasicDescription), &captureFormat), "set ExtAudioFile client format");

        // allocate the capture buffer, just keep it at half the size of the enqueue buffer
        // we don't ever want to pull any faster than we can push data in for render
        // this 2:1 ratio keeps the AQ Offline Render happy
        const UInt32 captureBufferByteSize = bufferByteSize / 2;

        AudioQueueBufferRef captureBuffer;
        AudioBufferList captureABL;

        XThrowIfError(AudioQueueAllocateBuffer(myInfo.mQueue[myInfo.mCurrentAudioFile], captureBufferByteSize, &captureBuffer), "AudioQueueAllocateBuffer");

        captureABL.mNumberBuffers = 1;
        captureABL.mBuffers[0].mData = captureBuffer->mAudioData;
        captureABL.mBuffers[0].mNumberChannels = captureFormat.mChannelsPerFrame;

        // lets start playing now - stop is called in the AQTestBufferCallback when there's
        // no more to read from the file
        XThrowIfError(AudioQueueStart(myInfo.mQueue[myInfo.mCurrentAudioFile], NULL), "AudioQueueStart failed");

        AudioTimeStamp ts;
        ts.mFlags = kAudioTimeStampSampleTimeValid;
        ts.mSampleTime = 0;

        // we need to call this once asking for 0 frames
        XThrowIfError(AudioQueueOfflineRender(myInfo.mQueue[myInfo.mCurrentAudioFile], &ts, captureBuffer, 0), "AudioQueueOfflineRender");

        // we need to enqueue a buffer after the queue has started
        AQTestBufferCallback(&myInfo, myInfo.mQueue[myInfo.mCurrentAudioFile], myInfo.mBuffer[myInfo.mCurrentAudioFile]);

        while (true) {
            UInt32 reqFrames = captureBufferByteSize / captureFormat.mBytesPerFrame;

            XThrowIfError(AudioQueueOfflineRender(myInfo.mQueue[myInfo.mCurrentAudioFile], &ts, captureBuffer, reqFrames), "AudioQueueOfflineRender");

            captureABL.mBuffers[0].mData = captureBuffer->mAudioData;
            captureABL.mBuffers[0].mDataByteSize = captureBuffer->mAudioDataByteSize;
            UInt32 writeFrames = captureABL.mBuffers[0].mDataByteSize / captureFormat.mBytesPerFrame;

            printf("t = %.f: AudioQueueOfflineRender:  req %d fr/%d bytes, got %d fr/%d bytes\n", ts.mSampleTime, (int)reqFrames, (int)captureBufferByteSize, writeFrames, (int)captureABL.mBuffers[0].mDataByteSize);

            XThrowIfError(ExtAudioFileWrite(captureFile, writeFrames, &captureABL), "ExtAudioFileWrite");

            if (myInfo.mFlushed) break;

            ts.mSampleTime += writeFrames;
        }

        CFRunLoopRunInMode(kCFRunLoopDefaultMode, 1, false);

        XThrowIfError(AudioQueueDispose(myInfo.mQueue[myInfo.mCurrentAudioFile], true), "AudioQueueDispose(true) failed");
        XThrowIfError(AudioFileClose(myInfo.mAudioFile[myInfo.mCurrentAudioFile]), "AudioFileClose failed");
        XThrowIfError(ExtAudioFileDispose(captureFile), "ExtAudioFileDispose failed");

        if (myInfo.mPacketDescs) delete [] myInfo.mPacketDescs;
        if (acl) free(acl);
    }
    catch (CAXException e) {
        char buf[256];
        fprintf(stderr, "Error: %s (%s)\n", e.mOperation, e.FormatError(buf));
    }

    return;
}


#if __cplusplus
}
#endif
asked Dec 15 '09 by iPhone Guy


2 Answers

OK -- I found an answer for this issue -- it will only work with wav files...

I used NSData to concatenate each file into the master data, then rewrote the header (the first 44 bytes) according to the wav file specification.

This process works well... the most complex part was rewriting the header info, but once that was figured out everything fell into place.
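
Roughly, it looks like this -- a sketch assuming canonical 44-byte PCM headers, identical formats across all input files, and a little-endian host (true on the iPhone); wavPaths and destPath are placeholders for your own paths:

NSMutableData *master = [NSMutableData data];

for (NSString *path in wavPaths) {          // wavPaths: your list of .wav files
    NSData *file = [NSData dataWithContentsOfFile:path];
    if ([master length] == 0) {
        [master appendData:file];           // keep the first file's header
    } else {
        // skip the 44-byte header on subsequent files, append the raw samples
        [master appendData:[file subdataWithRange:NSMakeRange(44, [file length] - 44)]];
    }
}

// patch the two little-endian size fields in the RIFF header:
//   bytes 4-7   = total file size - 8
//   bytes 40-43 = size of the data chunk (total file size - 44)
uint32_t riffSize = (uint32_t)[master length] - 8;
uint32_t dataSize = (uint32_t)[master length] - 44;
[master replaceBytesInRange:NSMakeRange(4, 4) withBytes:&riffSize];
[master replaceBytesInRange:NSMakeRange(40, 4) withBytes:&dataSize];

[master writeToFile:destPath atomically:YES];   // destPath: your output path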

answered Sep 24 '22 by iPhone Guy


Actually, I've learned that if you use mp3 you don't even need to rewrite the headers -- mp3 is just a stream of self-contained frames, so decoders pick up at the next frame boundary. You can:

// load the two mp3 files (path1/path2 are your source paths)
NSURL *soundFilePath = [[NSURL alloc] initFileURLWithPath: path1];
NSData *sound1Data = [[NSData alloc] initWithContentsOfURL: soundFilePath];
[soundFilePath release];

soundFilePath = [[NSURL alloc] initFileURLWithPath: path2];
NSData *sound2Data = [[NSData alloc] initWithContentsOfURL: soundFilePath];
[soundFilePath release];

// append the raw mp3 data back to back
NSMutableData *sounds = [[NSMutableData alloc] init];
[sounds appendData:sound1Data];
[sounds appendData:sound2Data];

[[NSFileManager defaultManager] createFileAtPath:[NSTemporaryDirectory() stringByAppendingPathComponent:@"tmp.mp3"]
                                        contents:sounds
                                      attributes:nil];

and you're all set.

answered Sep 22 '22 by bpink