Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

CORDIC Arcsine implementation fails

Tags:

I have recently implemented a library of CORDIC functions to reduce the required computational power (my project is based on a PowerPC and is extremely strict in its execution time specifications). The language is ANSI-C.

The other functions (sin/cos/atan) work within accuracy limits both in 32 and in 64 bit implementations.

Unfortunately, the asin() function fails systematically for certain inputs.

For testing purposes I have implemented an .h file to be used in a simulink S-Function. (This is only for my convenience, you can compile the following as a standalone .exe with minimal changes)

Note: I have forced 32 iterations because I am working in 32 bit precision and the maximum possible accuracy is required.

Cordic.h:

#include <stdio.h>
#include <stdlib.h>

#define FLOAT32 float
#define INT32 signed long int
#define BIT_XOR ^

#define CORDIC_1K_32 0x26DD3B6A                  
#define MUL_32       1073741824.0F     /*needed to scale float -> int*/            
#define INV_MUL_32   9.313225746E-10F  /*needed to scale int -> float*/

INT32 CORDIC_CTAB_32 [] = {0x3243f6a8, 0x1dac6705, 0x0fadbafc, 0x07f56ea6, 0x03feab76, 0x01ffd55b, 0x00fffaaa, 0x007fff55, 
                           0x003fffea, 0x001ffffd, 0x000fffff, 0x0007ffff, 0x0003ffff, 0x0001ffff, 0x0000ffff, 0x00007fff, 
                           0x00003fff, 0x00001fff, 0x00000fff, 0x000007ff, 0x000003ff, 0x000001ff, 0x000000ff, 0x0000007f, 
                           0x0000003f, 0x0000001f, 0x0000000f, 0x00000008, 0x00000004, 0x00000002, 0x00000001, 0x00000000};

/* CORDIC Arcsine Core: vectoring mode */
INT32 CORDIC_asin(INT32 arc_in)
{
  INT32 k;
  INT32 d;
  INT32 tx;
  INT32 ty;
  INT32 x;
  INT32 y;
  INT32 z;

  x=CORDIC_1K_32;
  y=0;
  z=0;

  for (k=0; k<32; ++k)
  {
    d = (arc_in - y)>>(31);
    tx = x - (((y>>k) BIT_XOR d) - d);
    ty = y + (((x>>k) BIT_XOR d) - d);
    z += ((CORDIC_CTAB_32[k] BIT_XOR d) - d);

    x = tx; 
    y = ty; 
  }  
  return z; 
}

/* Wrapper function for scaling in-out of cordic core*/
FLOAT32 asin_wrap(FLOAT32 arc)
{
  return ((FLOAT32)(CORDIC_asin((INT32)(arc*MUL_32))*INV_MUL_32));
}

This can be called in a manner similar to:

#include "Cordic.h"
#include "math.h"

void main()
{
    y1 = asin_wrap(value_32); /*my implementation*/
    y2 = asinf(value_32);     /*standard math.h for comparison*/
}

The results are as shown: enter image description here

Top left shows the [-1;1] input over 2000 steps (0.001 increments), bottom left the output of my function, bottom right the standard output and top right the difference of the two outputs.

It is immediate to see that the error is not within 32 bit accuracy.

I have analysed the steps performed (and the intermediate results) by my code and it seems to me that at a certain point the value of y is "close enough" to the initial value of arc_in and what could be related to a bit-shift causes the solution to diverge.


My questions:

  • I am at a loss, is this error inherent in the CORDIC implementation or have I made a mistake in the implementation? I was expecting the decrease of accuracy near the extremes, but those spikes in the middle are quite unexpected. (the most notable ones are just beyond +/- 0.6, but even removed these there are more at smaller values, albeit not as pronounced)
  • If it is something part of the CORDIC implementation, are there known workarounds?

EDIT:

  • Since some comment mention it, yes, I tested the definition of INT32, even writing #define INT32 int32_T does not change the results by the slightest amount.

  • The computation time on the target hardware has been measured by hundreds of repetitions of block of 10.000 iterations of the function with random input in the validity range. The observed mean results (for one call of the function) are as follows: math.h asinf() 100.00 microseconds CORDIC asin() 5.15 microseconds
    (apparently the previous test had been faulty, a new cross-test has obtained no better than an average of 100 microseconds across the validity range)

  • I apparently found a better implementation. It can be downloaded in matlab version here and in C here. I will analyse more its inner workings and report later.