Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fastest way to convert T[,] to T[][]?

So it turns out all arrays are not created equal. Multi-dimensional arrays can have non-zero lower bounds. See for example Excel PIA's Range.Value property object[,] rectData = myRange.Value;

I need to convert these data into a jagged array. My first try below smells of complexity. Any suggestions to optimize it? It needs to handle the general case where lower bounds may not be zero.

I have this ex method:

    public static T[][] AsJagged<T>( this T[,] rect )
    {
        int row1 = rect.GetLowerBound(0);
        int rowN = rect.GetUpperBound(0);
        int col1 = rect.GetLowerBound(1);
        int colN = rect.GetUpperBound(1);

        int height = rowN - row1 + 1;
        int width = colN - col1 + 1;
        T[][] jagged = new T[height][];

        int k = 0;
        int l;
        for ( int i = row1; i < row1 + height; i++ )
        {
            l = 0;
            T[] temp = new T[width];
            for ( int j = col1; j < col1 + width; j++ )
                temp[l++] = rect[i, j];
            jagged[k++] = temp;
        }

        return jagged;
    }

Used like this:

    public void Foo()
    {
        int[,] iRect1 = { { 1, 1, 1, 1 }, { 1, 1, 1, 1 }, { 1, 1, 1, 1 }, { 1, 1, 1, 1 }, { 1, 1, 1, 1 }, { 1, 1, 1, 1 }, { 1, 1, 1, 1 }, { 1, 1, 1, 1 } };
        int[][] iJagged1 = iRect1.AsJagged();

        int[] lengths = { 3, 5 };
        int[] lowerBounds = { 7, 8 };
        int[,] iRect2 = (int[,])Array.CreateInstance(typeof(int), lengths, lowerBounds);
        int[][] iJagged2 = iRect2.AsJagged();

    }

Curious if Buffer.BlockCopy() would work or be faster?

Edit: AsJagged needs to handle reference types.

Edit: Found bug in AsJagged(). Added int l; and added col1 + width to inner loop.

like image 375
dFlat Avatar asked Feb 20 '12 06:02

dFlat


People also ask

How do you convert KT to T?

1 Kilotonne: 1 Kilotonne or metric kiloton (unit of mass) is equal to 1000 metric tons.

What is the difference between 1 ton and 1 tonne?

Although they sound the same and both refer to a unit of mass, there is a difference between the words 'ton' and 'tonne' beyond just spelling: A ton is an imperial unit of mass equivalent to 1,016.047 kg or 2,240 lbs. A tonne is a metric unit of mass equivalent to 1,000 kg or 2,204.6 lbs.

Is ton and Mt same?

Ton: Lighter, 1 (short) ton = 907.18474 kg. Metric ton: Heavier, 1 metric ton= 1000 kg.

How many metric tonnes make a tonne?

1 ton = 0.907185 metric tons.


1 Answers

A view caveats/assumptions up front:

  • You seem to use only int as your data type (or at least seem to be OK with using Buffer.BlockCopy which would imply you can life with primitive types in general).
  • For the test data you show, I don't think there will be much different using any somewhat sane approach.

Having that said, the following implementation (which needs to be specialized for a specific primitive type (here int) because it uses fixed) is around 10 times faster than the approach using the inner loop:

    unsafe public static int[][] AsJagged2(int[,] rect)
    {
        int row1 = rect.GetLowerBound(0);
        int rowN = rect.GetUpperBound(0);
        int col1 = rect.GetLowerBound(1);
        int colN = rect.GetUpperBound(1);

        int height = rowN - row1 + 1;
        int width = colN - col1 + 1;
        int[][] jagged = new int[height][];

        int k = 0;
        for (int i = row1; i < row1 + height; i++)
        {
            int[] temp = new int[width];

            fixed (int *dest = temp, src = &rect[i, col1])
            {
                MoveMemory(dest, src, rowN * sizeof(int));
            }

            jagged[k++] = temp;
        }

        return jagged;
    }

    [DllImport("kernel32.dll", EntryPoint = "RtlMoveMemory")]
    unsafe internal static extern void MoveMemory(void* dest, void* src, int length);

Using the following "test code":

    static void Main(string[] args)
    {
        Random rand = new Random();
        int[,] data = new int[100,1000];
        for (int i = 0; i < data.GetLength(0); i++)
        {
            for (int j = 0; j < data.GetLength(1); j++)
            {
                data[i, j] = rand.Next(0, 1000);
            }
        }

        Stopwatch sw = Stopwatch.StartNew();

        for (int i = 0; i < 100; i++)
        {
            int[][] dataJagged = AsJagged(data);
        }

        Console.WriteLine("AsJagged:  " + sw.Elapsed);

        sw = Stopwatch.StartNew();

        for (int i = 0; i < 100; i++)
        {
            int[][] dataJagged2 = AsJagged2(data);
        }

        Console.WriteLine("AsJagged2: " + sw.Elapsed);
    }

Where AsJagged (the first case) is your original function, I get the following output:

AsJagged:  00:00:00.9504376
AsJagged2: 00:00:00.0860492

So there is indeed a faster way of doing it, however depending on the size of the test data, the number of times you actually perform this operation, and your willingness to allow unsafe and P/Invoke code, you're probably not going to need it.

Having that said, we were using large matrixes of double (say 7000x10000 elements) where it indeed did make a huge difference.

Update: about using Buffer.BlockCopy

I might overlook some Marshal or other trick, but I don't think using Buffer.BlockCopy is possible here. This is due to the fact that it requires both the source and destination array to, well, be an Array.

In our example, the destination is an array (e.g. int[] temp = ...) however the source is not. While we "know" that for two dimensional arrays of primitive types the layout is such, that each "row" (i.e. first dimension) is an array of the type in memory, there is no safe (as in unsafe) way to get that array without the overhead of copying it first. So we basically need to use a function that simply deals with memory and doesn't care about the actual content of it - like MoveMemory. BTW, the internal implementation of Buffer.BlockCopy does something similar.

like image 143
Christian.K Avatar answered Sep 30 '22 04:09

Christian.K