The math behind Apple's SpeakHere example

Question

I have a question regarding the math that Apple is using in its SpeakHere example.

A little background: I know that the average power and peak power returned by AVAudioRecorder and AVAudioPlayer are in dB. I also understand why the RMS power is in dB and that it needs to be converted to a linear amplitude using pow(10, (0.05 * avgPower)).

My question is:

Apple uses this formula to create its "Meter Table":

MeterTable::MeterTable(float inMinDecibels, size_t inTableSize, float inRoot)
    : mMinDecibels(inMinDecibels),
    mDecibelResolution(mMinDecibels / (inTableSize - 1)), 
    mScaleFactor(1. / mDecibelResolution)
{
    if (inMinDecibels >= 0.)
    {
        printf("MeterTable inMinDecibels must be negative");
        return;
    }

    mTable = (float*)malloc(inTableSize*sizeof(float));

    double minAmp = DbToAmp(inMinDecibels);
    double ampRange = 1. - minAmp;
    double invAmpRange = 1. / ampRange;

    double rroot = 1. / inRoot;
    for (size_t i = 0; i < inTableSize; ++i) {
        double decibels = i * mDecibelResolution;
        double amp = DbToAmp(decibels);
        double adjAmp = (amp - minAmp) * invAmpRange;
        mTable[i] = pow(adjAmp, rroot);
    }
}

What are all these calculations, or rather, what does each of these steps do? I think that mDecibelResolution and mScaleFactor are used to map an 80 dB range onto 400 values (unless I'm mistaken). However, what is the significance of inRoot, ampRange, invAmpRange and adjAmp? Additionally, why is the i-th entry in the meter table "mTable[i] = pow(adjAmp, rroot);"?

Any help is much appreciated! :)

Thanks in advance and cheers!

codeBearer · Accepted Answer

It's been a month since I asked this question, and thanks, Geebs, for your response! :)

So, this is related to a project that I've been working on, and the feature based on it was implemented about 2 days after I asked the question. Clearly, I've slacked off on posting a closing response (sorry about that). I also posted a comment on Jan 7, but circling back, it seems I had mixed up some variable names. >_< So I thought I'd give a full, line-by-line answer to this question (with pictures). :)

So, here goes:

//mDecibelResolution is the "weight" factor of each entry in the meter table, i.e. how many dB one table index represents.
//Here, the table is of size 400, and we're looking at indices 0 to 399.
//Thus, the "weight" factor of each index is minValue / 399 (for a minimum of -80 dB, about -0.2 dB per index).


MeterTable::MeterTable(float inMinDecibels, size_t inTableSize, float inRoot)
    : mMinDecibels(inMinDecibels),
    mDecibelResolution(mMinDecibels / (inTableSize - 1)), 
    mScaleFactor(1. / mDecibelResolution)
{
    if (inMinDecibels >= 0.)
    {
        printf("MeterTable inMinDecibels must be negative");
        return;
    }

    //Allocate a table to store the 400 values
    mTable = (float*)malloc(inTableSize*sizeof(float));

    //Remember, "dB" is a logarithmic scale.
    //If we have a range of -160dB to 0dB, -80dB is NOT 50% power!!!
    //We need to convert it to a linear scale. Thus, we do pow(10, (0.05 * dbValue)), the dB-to-amplitude conversion mentioned in my question.

    double minAmp = DbToAmp(inMinDecibels);

    //For the next couple of steps, you need to know linear interpolation.
    //Again, remember that all calculations are on a LINEAR scale.
    //Attached is an image of the basic linear interpolation formula, and some simple equation solving.

    //[Image: Linear Interpolation Equation]

    //As per the image, and the following line, (y1 - y0) is the ampRange,
    //where y1 = maxAmp and y0 = minAmp.
    //In this case, maxAmp = 1.0, as our maxDB is 0dB. FYI: 0dB corresponds to a linear amplitude of 1.0 (a unitless ratio, not amperes).
    //Thus, ampRange = (maxAmp - minAmp) = 1. - minAmp
    double ampRange = 1. - minAmp;

    //As you can see, invAmpRange is the extreme right hand side fraction on our image's "Step 3"
    double invAmpRange = 1. / ampRange;

    //Now, if we were looking for different values of x0, x1, y0 or y1, we would simply substitute them into that equation and be good to go. :)
    //The only reason we were able to get rid of x0 is that our minInterpolatedValue is 0.

    //I'll come to this later.
    double rroot = 1. / inRoot;

    for (size_t i = 0; i < inTableSize; ++i) {
        //Thus, for each entry in the table, multiply its index by the "weight" factor to get the dB value it represents.
        double decibels = i * mDecibelResolution;

        //Convert that dB value to a linear amplitude using pow(10, (0.05 * decibelValue));
        double amp = DbToAmp(decibels);

        //This is linear interpolation - based on our image, this is the same as "Step 3" of the image.
        double adjAmp = (amp - minAmp) * invAmpRange;

        //This is where inRoot and rroot come into the picture.
        //Linear interpolation gives you a "straight line" between 2 end-points.
        //With inRoot = 2, rroot = 0.5.
        //If I raise a variable, say myValue, to the power 0.5, I am essentially taking the square root of myValue.
        //So, instead of getting a "straight line" response, by storing the square root of the value,
        //we get a curved response that is similar to the one drawn in the image (note: not to scale).
        mTable[i] = pow(adjAmp, rroot);
    }
}
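
Since the interpolation image is not reproduced here, the following is a reconstruction of that step from the code comments, not a copy of the original figure. It assumes the interpolated output x runs from x_0 = 0 to x_1 = 1 while the amplitude y runs from y_0 = minAmp to y_1 = maxAmp = 1:

```latex
\begin{align*}
\frac{x - x_0}{x_1 - x_0} &= \frac{y - y_0}{y_1 - y_0} \\
x &= x_0 + (y - y_0)\,\frac{x_1 - x_0}{y_1 - y_0} \\
x &= (y - y_0)\,\frac{1}{y_1 - y_0}
   = (\mathrm{amp} - \mathrm{minAmp}) \cdot \frac{1}{1 - \mathrm{minAmp}}
\end{align*}
```

The last line is what the loop computes as adjAmp = (amp - minAmp) * invAmpRange.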

[Image: response curves, linear response vs. square root response] As you can see, the "linear curve" is not exactly a curve. >_<
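
To put rough numbers on the two curves, here is a minimal sketch. It is not taken from Apple's sample: LinearResponse and RootResponse are names made up for illustration, and DbToAmp is assumed to be the usual pow(10, 0.05 * dB) conversion.

```cpp
#include <cmath>

// Assumed form of the dB-to-amplitude helper (not copied from the sample).
double DbToAmp(double db) { return std::pow(10.0, 0.05 * db); }

// Hypothetical helper: amplitude rescaled so that minDb maps to 0 and 0 dB maps to 1.
double LinearResponse(double db, double minDb) {
    const double minAmp = DbToAmp(minDb);
    return (DbToAmp(db) - minAmp) / (1.0 - minAmp);
}

// Hypothetical helper: the same value pushed through the root curve.
// root = 2 gives a square root, root = 3 a cube root, and so on.
double RootResponse(double db, double minDb, double root) {
    return std::pow(LinearResponse(db, minDb), 1.0 / root);
}
```

With minDb = -80 and root = 2, a -20 dB level comes out at roughly 0.10 on the linear response and roughly 0.32 after the square root, so quiet signals still move the meter visibly.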

Hope this helps the community in some way. :)

Geebs · Answer

No expert, but based on physics and math:

Assume the max amplitude is 1 and the minimum is 0.0001 [corresponding to -80 dB, which is what the min dB value is set to in the Apple example: #define kMinDBvalue -80.0 in AQLevelMeter.h]

minAmp is the minimum amplitude = 0.0001 for this example

Now, all that is being done is that each amplitude (taken at multiples of the decibel resolution) is rescaled relative to the minimum amplitude:
adjusted amplitude = (amp - minAmp) / (1 - minAmp)
This makes the range of the adjusted amplitude 0 to 1 instead of 0.0001 to 1 (if that was desired).
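
As a quick check of that figure, here is a small sketch of the conversion. The body of DbToAmp below is an assumption based on the pow(10, 0.05 * dB) formula from the question, not a copy of Apple's helper.

```cpp
#include <cmath>

// Assumed form of the conversion: amplitude = 10^(dB / 20).
double DbToAmp(double decibels) {
    return std::pow(10.0, 0.05 * decibels);
}
```

Calling DbToAmp(-80.0) gives about 0.0001, and DbToAmp(0.0) gives 1, which are the two end-points used above.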

inRoot is set to 2 here, so rroot = 1/2, and raising to the power 1/2 is taking the square root. From Apple's file:
// inRoot - this controls the curvature of the response. 2.0 is square root, 3.0 is cube root. But inRoot doesn't have to be integer valued, it could be 1.8 or 2.5, etc.
Essentially this gives you a response between 0 and 1 again, and the curvature of that response varies based on the value you set for inRoot.
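
Putting the pieces together, here is a hedged sketch of how such a table could be built and then used for lookups. MeterTableSketch and its members are hypothetical names, not Apple's class; the lookup via a scale factor is my reading of what mScaleFactor is for. It assumes minDb is negative and the table has at least 2 entries.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical re-creation for illustration only; assumes inMinDb < 0 and size >= 2.
struct MeterTableSketch {
    double minDb;
    double scaleFactor;        // table indices per dB (negative, because minDb < 0)
    std::vector<float> table;  // index 0 holds 0 dB, the last index holds minDb

    MeterTableSketch(double inMinDb, std::size_t size, double root)
        : minDb(inMinDb),
          scaleFactor((size - 1) / inMinDb),
          table(size) {
        const double dbStep = inMinDb / (size - 1);            // dB per index
        const double minAmp = std::pow(10.0, 0.05 * inMinDb);  // amplitude at minDb
        for (std::size_t i = 0; i < size; ++i) {
            const double amp = std::pow(10.0, 0.05 * (i * dbStep));
            double adj = (amp - minAmp) / (1.0 - minAmp);      // rescale to 0..1
            if (adj < 0.0) adj = 0.0;  // guard against rounding at the last entry
            table[i] = static_cast<float>(std::pow(adj, 1.0 / root));
        }
    }

    // Map a dB reading to a 0..1 meter value by indexing into the table.
    float ValueAt(double db) const {
        if (db <= minDb) return 0.0f;
        if (db >= 0.0) return 1.0f;
        return table[static_cast<std::size_t>(db * scaleFactor)];
    }
};
```

For example, MeterTableSketch(-80.0, 400, 2.0).ValueAt(-40.0) lands on index 199 and returns a value close to 0.1. Precomputing the table means each meter update is one multiply and one array read instead of two pow calls.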
