Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fast integer-only standard deviation?

Ages ago I had an extremely fast integer-only standard deviation function (in C) that would return values that were "reasonably" accurate using no division or multiplication, just shifts and adds. I have since lost that code, and Google has been unable to help me find anything like it, and my discrete math skills are a bit too rusty to re-derive it.

In my specific case, I have a list of 14-bit ADC values for which I want to very quickly compute a "close enough" standard deviation on an 8-bit processor that lacks floating point hardware.

Does this ring a bell with anyone?

like image 974
BobC Avatar asked Mar 23 '23 01:03

BobC


1 Answers

Based on the link that @KerrekSB provided, we can rework the algorithm to use only integer arithmetic.

uint32_t std_var (uint16_t a[], uint16_t n) {
    if (n == 0) return 0;
    uint64_t sum = 0;
    uint64_t sq_sum = 0;
    for(unsigned i = 0; i < n; ++i) {
       uint32_t ai = a[i];
       sum += ai;
       sq_sum += ai * ai;
    }
    uint64_t N = n;
    return (N * sq_sum - sum * sum) / (N * N);
}

To get the standard deviation, take the square root of the result. To implement the integer square root, you can choose one of the multitude of answers provided by:

  • Looking for an efficient integer square root algorithm for ARM Thumb2

The algorithm does not try to account for any rounding errors during the intermediate calculations, it simply assumes all the values will fit, so no rounding is needed. This is what allows for the formula to be presented in a straightforward manner. Optimized algorithms for single pass variance usually perform intermediate divisions in an attempt to compensate for errors due to cancellation. For example (this is from Wikipedia):

double std_var_stable (uint16_t a[], uint16_t n) {
    if (n == 0) return 0;
    unsigned i;
    double mean = 0;
    double M2 = 0;
    for(i = 0; i < n; ++i) {
       double delta = a[i] - mean;
       mean += delta / (i + 1);
       M2 += delta * (a[i] - mean);
    }
    return M2/n;
}

However, intermediate divisions is not a suitable nor desirable for an integer math only algorithm.

like image 155
jxh Avatar answered Apr 02 '23 19:04

jxh