Array Bitwise Type Conversion in C#

Tags: c++, arrays, c#

The following is a simple console application written in C++:

#include <iostream>
using namespace std;

int main()
{
    const __int32 length = 4;
    __int32 ints[length] = {1, 2, 3, 4 };
    __int32* intArray = ints;
    __int64* longArray = (__int64*)intArray;
    for (__int32 i = 0; i < length; i++) cout << intArray[i] << '\n';
    cout << '\n';
    for (__int32 i = 0; i < length / 2; i++) cout << longArray[i] << '\n';
    cout << '\n';
    cout << "Press any key to exit.\n";
    cin.get();
}

The program takes an array of four 32-bit signed integers and reinterprets it as an array of two 64-bit signed integers. It is highly efficient because the only operation is a pointer cast; no data is copied.

In C#, the equivalent can be done by creating a new array of the target type and copying the memory of the original array into it. This is easy to write with the System.Runtime.InteropServices.Marshal class. However, it is inefficient for larger arrays because of the overhead of copying many megabytes of data.

Additionally, there are situations where one would want two arrays of different unmanaged types to refer to the same location in memory. For instance, one might perform operations on one array and expect to see the changes through the other.

To be clear, I want to convert the arrays bit by bit, not value by value. If that does not make sense, this was the output from the console:

1
2
3
4

8589934593
17179869187
Asked by Brendan Lynn, Dec 31 '25 at 15:12


1 Answer

You can use Span<T> to do this without copying the array:

using System;
using System.Runtime.InteropServices; // MemoryMarshal

int[] source = { 1, 2, 3, 4 };

Span<long> dest = MemoryMarshal.Cast<int, long>(source.AsSpan());

foreach (var element in dest)
{
    Console.WriteLine(element); // Outputs 8589934593 and 17179869187
}

However, if you need the data as an actual long[] array, you will have to make a copy.

If you can accept unsafe code, this is likely to be slightly faster (but probably not by so much as to make it worth using unsafe code):

int[] source = { 1, 2, 3, 4 };

unsafe
{
    fixed (int* p = source)
    {
        long* q = (long*)p;

        for (int i = 0; i < source.Length/2; i++)
        {
            Console.WriteLine(*q++);
        }
    }
}

Another approach (which is slower but will give you the data in a separate array) is to use Buffer.BlockCopy(). If you can pre-allocate and reuse the destination array, you can save the overhead of allocating the destination - but you still pay to copy all the data.

int[] source = { 1, 2, 3, 4 };
long[] dest = new long[source.Length/2];

Buffer.BlockCopy(source, 0, dest, 0, sizeof(int) * source.Length);

foreach (var element in dest)
{
    Console.WriteLine(element);
}

We should never make performance decisions without benchmarks, so let's try some:

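// Requires the BenchmarkDotNet package and a project that allows unsafe code.
using System;
using System.Linq;
using System.Runtime.InteropServices;
using BenchmarkDotNet.Attributes;

[MemoryDiagnoser]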
public class Benchmarks
{
    [Benchmark]
    public void BlockCopy()
    {
        viaBlockCopy();
    }

    static long viaBlockCopy()
    {
        Buffer.BlockCopy(source, 0, dest, 0, sizeof(int) * source.Length);

        long total = 0;

        for (int i = 0; i < dest.Length; ++i)
            total += dest[i];

        return total;
    }

    [Benchmark]
    public void Unsafe()
    {
        viaUnsafe();
    }

    static long viaUnsafe()
    {
        unsafe
        {
            fixed (int* p = source)
            {
                long* q = (long*)p;
                long* end = q + source.Length / 2;

                long total = 0;

                while (q != end)
                    total += *q++;

                return total;
            }
        }
    }

    [Benchmark]
    public void Span()
    {
        viaSpan();
    }

    static long viaSpan()
    {
        Span<long> result = MemoryMarshal.Cast<int, long>(source.AsSpan());

        long total = 0;

        foreach (var element in result)
        {
            total += element;
        }

        return total;
    }

    static readonly int[]  source = Enumerable.Range(0, 1024 * 1024).ToArray();
    static readonly long[] dest   = new long[1024 * 1024/2];
}

Note that the BlockCopy() benchmark reuses the dest buffer to avoid the overhead of creating an output array. If your code has to create an output buffer for each call, it will be significantly slower.

And the results:

|    Method |     Mean |   Error |  StdDev | Allocated |
|---------- |---------:|--------:|--------:|----------:|
| BlockCopy | 362.7 us | 3.53 us | 3.30 us |         - |
|    Unsafe | 108.6 us | 0.68 us | 0.57 us |         - |
|      Span | 134.4 us | 0.37 us | 0.33 us |         - |

You can make up your own mind whether unsafe code is worth the extra performance (personally, I avoid unsafe code altogether).

Also note that these benchmarks include the time to iterate over all the elements of the result. If you omit that part, then for the Span and unsafe methods you'll just end up measuring the tiny amount of time needed to "cast" the data.

For completeness, here are the times if you remove the total calculation from the benchmarks (note that the numbers are in nanoseconds rather than microseconds!):

|    Method |            Mean |         Error |        StdDev | Allocated |
|---------- |----------------:|--------------:|--------------:|----------:|
| BlockCopy | 108,173.6835 ns | 1,591.5239 ns | 1,328.9946 ns |         - |
|    Unsafe |       0.9529 ns |     0.0105 ns |     0.0088 ns |         - |
|      Span |       1.1429 ns |     0.0042 ns |     0.0033 ns |         - |

Now you can see why I added in the total calculations...

Answered by Matthew Watson, Jan 02 '26 at 04:01