
Is there a more efficient way to convert double to float?

Tags:

performance

c#

I have a need to convert a multi-dimensional double array to a jagged float array. The sizes will vary from [2][5] up to around [6][1024].

I was curious how just looping and casting the double to float would perform, and it's not TOO bad: about 225 µs for a [2][5] array. Here's the code:

const int count = 5;
const int numCh = 2;
double[,] dbl = new double[numCh, count];
float[][] flt = new float[numCh][];

for (int i = 0; i < numCh; i++)
{
    flt[i] = new float[count];
    for (int j = 0; j < count; j++)
    {
        flt[i][j] = (float)dbl[i, j];
    }
}

However, if there are more efficient techniques I'd like to use them. I should mention that I ONLY timed the two nested loops, not the allocations before them.

After experimenting a little more I think 99% of the time is burned on the loops, even without the assignment!
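For reference, a minimal timing harness along these lines might look like the sketch below (my own illustration, not the asker's actual benchmark), using `System.Diagnostics.Stopwatch` and repeating the loop so the total time dwarfs the timer's resolution:

```csharp
using System;
using System.Diagnostics;

class ConvertTiming
{
    static float[][] flt;

    static void Main()
    {
        const int count = 5;
        const int numCh = 2;
        double[,] dbl = new double[numCh, count];
        flt = new float[numCh][];
        for (int i = 0; i < numCh; i++)
        {
            flt[i] = new float[count];
            for (int j = 0; j < count; j++)
                dbl[i, j] = i * count + j; // known values so we can sanity-check
        }

        const int reps = 100_000; // repeat so total time dwarfs timer resolution
        var sw = Stopwatch.StartNew();
        for (int r = 0; r < reps; r++)
            for (int i = 0; i < numCh; i++)
                for (int j = 0; j < count; j++)
                    flt[i][j] = (float)dbl[i, j];
        sw.Stop();

        Console.WriteLine($"{sw.Elapsed.TotalMilliseconds * 1000.0 / reps:F3} us per pass");
        Console.WriteLine(flt[numCh - 1][count - 1]); // sanity check: prints 9
    }
}
```

Averaging over many repetitions like this also smooths out JIT warm-up and scheduler noise, which matter a lot at the microsecond scale being measured here.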

asked Oct 06 '11 by scubasteve


1 Answer

This will run faster. For small data it's not worth doing Parallel.For(0, count, (j) => ...); it actually runs considerably slower for very small data, which is why I have commented that section out.

double* dp0;
float* fp0;

// Pin the managed arrays so the GC can't move them while we hold raw pointers.
fixed (double* dp1 = dbl)
{
    dp0 = dp1;

    // Reuse one pinned scratch row and Clone() it per channel, so we only
    // pay the fix/unfix overhead once instead of once per row.
    float[] newFlt = new float[count];
    fixed (float* fp1 = newFlt)
    {
        fp0 = fp1;
        for (int i = 0; i < numCh; i++)
        {
            //Parallel.For(0, count, (j) =>
            for (int j = 0; j < count; j++)
            {
                // Treat the [numCh, count] array as a flat buffer:
                // pointer arithmetic instead of a multidimensional indexer.
                fp0[j] = (float)dp0[i * count + j];
            }
            //});
            flt[i] = newFlt.Clone() as float[];
        }
    }
}

This runs faster because accessing multidimensional ([,]) arrays is really taxing in .NET due to the array bounds checking. The newFlt.Clone() just means we're not fixing and unfixing new pointers all the time (as there is a slight overhead in doing so).
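If you'd rather avoid unsafe code entirely, one middle-ground sketch (my own suggestion, not benchmarked against the pointer version above) is to flatten the [,] array into a one-dimensional double[] with Buffer.BlockCopy first, so the inner loop only pays single-dimension bounds checks:

```csharp
using System;

class FlattenSketch
{
    public static float[][] ConvertToJagged(double[,] dbl, int numCh, int count)
    {
        // Copy the rectangular array into a flat buffer in one call;
        // Buffer.BlockCopy works byte-wise on arrays of primitive types.
        double[] flat = new double[numCh * count];
        Buffer.BlockCopy(dbl, 0, flat, 0, flat.Length * sizeof(double));

        float[][] flt = new float[numCh][];
        for (int i = 0; i < numCh; i++)
        {
            float[] row = new float[count];
            int baseIdx = i * count;
            for (int j = 0; j < count; j++)
                row[j] = (float)flat[baseIdx + j]; // single-dimension bounds check only
            flt[i] = row;
        }
        return flt;
    }

    static void Main()
    {
        var dbl = new double[2, 3] { { 1.5, 2.5, 3.5 }, { 4.5, 5.5, 6.5 } };
        var flt = ConvertToJagged(dbl, 2, 3);
        Console.WriteLine(flt[1][2]); // prints 6.5
    }
}
```

This keeps the code verifiable (no /unsafe switch) at the cost of one extra allocation and copy per conversion.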

You will need to wrap this in an unsafe block (or mark the method unsafe) and compile with /unsafe.

But really you should be benchmarking with data closer to 5000 x 5000, not 5 x 2. If something takes less than 1000 ms, you need to either add more loop iterations or increase the data size, because at that level a minor spike in CPU activity can add a lot of noise to your profiling.

answered Oct 02 '22 by Seph