
Data format from recording using Audio Queue framework

Tags:

iphone

I'm writing an iPhone app which should record the user's voice and feed the audio data into a library for modifications such as changing tempo and pitch. I started off with the SpeakHere example code from Apple:

http://developer.apple.com/library/ios/#samplecode/SpeakHere/Introduction/Intro.html

That project lays the groundwork for recording the user's voice and playing it back. It works well.

Now I'm diving into the code and I need to figure out how to feed the audio data into the SoundTouch library (http://www.surina.net/soundtouch/) to change the pitch. I became familiar with the Audio Queue framework while going through the code, and I found the place where I receive the audio data from the recording.

Essentially, you call AudioQueueNewInput to create a new input queue. You pass a callback function which is called every time a chunk of audio data is available. It is within this callback that I need to pass the chunks of data into SoundTouch.

I have it all set up, but the audio I play back after passing it through SoundTouch is mostly static (it barely resembles the original). If I skip SoundTouch and play back the original audio, it works fine.

Basically, I'm missing something about what the data I'm getting actually represents. I assumed I'm getting an interleaved stream of 16-bit shorts, one sample per channel. That's what SoundTouch expects, so my assumption must be wrong somehow.
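To make my assumption concrete: for N-channel interleaved 16-bit PCM, I'm treating the buffer as frames of N shorts each, so frame i, channel c lives at index i * N + c. The helper name here is just for illustration:

```c
#include <stdint.h>

/* Interleaved 16-bit PCM: frame i, channel c is at samples[i * numChannels + c]. */
int16_t sample_at(const int16_t *samples, uint32_t numChannels,
                  uint32_t frame, uint32_t channel)
{
    return samples[frame * numChannels + channel];
}
```

If that layout were correct, passing mAudioData straight to putSamples (with byte count divided by 2 and by the channel count, as below) should work, which is why I suspect the format itself.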

Here is the code which sets up the audio queue so you can see how it is configured.

void AQRecorder::SetupAudioFormat(UInt32 inFormatID)
{
    memset(&mRecordFormat, 0, sizeof(mRecordFormat));

    UInt32 size = sizeof(mRecordFormat.mSampleRate);
    XThrowIfError(AudioSessionGetProperty(kAudioSessionProperty_CurrentHardwareSampleRate,
                                          &size,
                                          &mRecordFormat.mSampleRate), "couldn't get hardware sample rate");

    size = sizeof(mRecordFormat.mChannelsPerFrame);
    XThrowIfError(AudioSessionGetProperty(kAudioSessionProperty_CurrentHardwareInputNumberChannels,
                                          &size,
                                          &mRecordFormat.mChannelsPerFrame), "couldn't get input channel count");

    mRecordFormat.mFormatID = inFormatID;
    if (inFormatID == kAudioFormatLinearPCM)
    {
        // if we want pcm, default to signed 16-bit little-endian
        mRecordFormat.mFormatFlags = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;
        mRecordFormat.mBitsPerChannel = 16;
        mRecordFormat.mBytesPerPacket = mRecordFormat.mBytesPerFrame = (mRecordFormat.mBitsPerChannel / 8) * mRecordFormat.mChannelsPerFrame;
        mRecordFormat.mFramesPerPacket = 1;
    }
}

And here's part of the code which actually sets it up:

    SetupAudioFormat(kAudioFormatLinearPCM);

    // create the queue
    XThrowIfError(AudioQueueNewInput(
                                  &mRecordFormat,
                                  MyInputBufferHandler,
                                  this /* userData */,
                                  NULL /* run loop */, NULL /* run loop mode */,
                                  0 /* flags */, &mQueue), "AudioQueueNewInput failed");

And finally, here is the callback which handles new audio data:

void AQRecorder::MyInputBufferHandler(void *inUserData,
                                      AudioQueueRef inAQ,
                                      AudioQueueBufferRef inBuffer,
                                      const AudioTimeStamp *inStartTime,
                                      UInt32 inNumPackets,
                                      const AudioStreamPacketDescription *inPacketDesc)
{
    AQRecorder *aqr = (AQRecorder *)inUserData;
    try {
        if (inNumPackets > 0) {
            CAStreamBasicDescription queueFormat = aqr->DataFormat();
            SoundTouch *soundTouch = aqr->getSoundTouch();

            soundTouch->putSamples((const SAMPLETYPE *)inBuffer->mAudioData,
                                   inBuffer->mAudioDataByteSize / 2 / queueFormat.NumberChannels());

            SAMPLETYPE *samples = (SAMPLETYPE *)malloc(sizeof(SAMPLETYPE) * 10000 * queueFormat.NumberChannels());
            UInt32 numSamples;
            while ((numSamples = soundTouch->receiveSamples(samples, 10000))) {
                // write packets to file
                XThrowIfError(AudioFileWritePackets(aqr->mRecordFile,
                                                    FALSE,
                                                    numSamples * 2 * queueFormat.NumberChannels(),
                                                    NULL,
                                                    aqr->mRecordPacket,
                                                    &numSamples,
                                                    samples),
                              "AudioFileWritePackets failed");
                aqr->mRecordPacket += numSamples;
            }
            free(samples);
        }

        // if we're not stopping, re-enqueue the buffer so that it gets filled again
        if (aqr->IsRunning())
            XThrowIfError(AudioQueueEnqueueBuffer(inAQ, inBuffer, 0, NULL), "AudioQueueEnqueueBuffer failed");
    } catch (CAXException e) {
        char buf[256];
        fprintf(stderr, "Error: %s (%s)\n", e.mOperation, e.FormatError(buf));
    }
}

You can see that I'm passing the data in inBuffer->mAudioData to SoundTouch. In my callback, what exactly are the bytes representing, i.e. how do I extract samples from mAudioData?

James Long asked Nov 15 '22
1 Answer

The default endianness of the Audio Queue may be the opposite of what you expect. You may have to swap the upper and lower bytes of each 16-bit audio sample after recording and before playback.

sample_le = (0xff00 & (sample_be << 8)) | (0x00ff & (sample_be >> 8)) ;
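Applied across a whole buffer, that swap looks roughly like this (the function and parameter names are just illustrative; count is the number of 16-bit samples, not bytes):

```c
#include <stddef.h>
#include <stdint.h>

/* Swap the byte order of every 16-bit sample in place. */
static void swap_sample_bytes(int16_t *samples, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        uint16_t s = (uint16_t)samples[i];
        samples[i] = (int16_t)((uint16_t)(s << 8) | (s >> 8));
    }
}
```

You would run this over inBuffer->mAudioData before putSamples, and again over the output of receiveSamples before writing it to the file, if the queue's endianness really is reversed relative to what SoundTouch expects.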
hotpaw2 answered Dec 28 '22