Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fastest way of reading and writing binary

Tags:

c#

file-io

binary

I'm currently optimizing an application, one of the operations that is done very often is reading and writing binary. I need 2 types of functions:

Set(byte[] target, int index, int value);

int Get(byte[] source, int index);

These functions are needed for signed and unsigned short, int and long in big and little endian order.

Here are some examples i've made, but i need a evaluation about the advantages and disadvantages:

first method is using Marshal to write the value into the memory of the byte[], the second is using plain pointers to accomplish this and the third uses BitConverter and BlockCopy to do this

unsafe void Set(byte[] target, int index, int value)
{
    fixed (byte* p = &target[0])
    {
        Marshal.WriteInt32(new IntPtr(p), index, value);
    }
}

unsafe void Set(byte[] target, int index, int value)
{
    int* p = &value;
    for (int i = 0; i < 4; i++)
    {
        target[offset + i] = *((byte*)p + i);
    }
}

void Set(byte[] target, int index, int value)
{
    byte[] data = BitConverter.GetBytes(value);
    Buffer.BlockCopy(data, 0, target, index, data.Length);
}

And here are the Read/Get methods:

the first is using Marshal to read the value from the byte[], the second is using plain pointers and the third is using BitConverter again:

unsafe int Get(byte[] source, int index)
{
    fixed (byte* p = &source[0])
    {
        return Marshal.ReadInt32(new IntPtr(p), index);
    }
}

unsafe int Get(byte[] source, int index)
{
    fixed (byte* p = &source[0])
    {
        return *(int*)(p + index);
    }
}

unsafe int Get(byte[] source, int index)
{
    return BitConverter.ToInt32(source, index);
}

boundary checking needs to be done but isn't part of my question yet...

I would be pleased if someone can tell what would be the best and fastest way in this case or give me some other solutions to work on. A generic solution would be preferable


I Just did some performance testing, here are the results:

Set Marshal: 45 ms, Set Pointer: 48 ms, Set BitConverter: 71 ms Get Marshal: 45 ms, Get Pointer: 26 ms, Get BitConverter: 30 ms

it seems that using pointers is the fast way, but i think Marshal and BitConverter do some internal checking... can someone verify this?

like image 761
haze4real Avatar asked Jan 10 '10 10:01

haze4real


Video Answer


1 Answers

Important: if you only need the one endian, see the pointer magic by wj32 / dtb


Personally, I would be writing directly to a Stream (perhaps with some buffering), and re-using a shared buffer that I can generally assume is clean. Then you can make some shortcuts and assume index 0/1/2/3.

Certainly don't use BitConverter, as that can't be used for both little/big-endian, which you require. I would also be inclined to just use bit-shifting rather than unsafe etc. It is actally the fastest, based on the following (so I'm glad that this is how I already do it my code here, look for EncodeInt32Fixed):

Set1: 371ms
Set2: 171ms
Set3: 993ms
Set4: 91ms <==== bit-shifting ;-p

code:

using System;
using System.Diagnostics;
using System.Runtime.InteropServices;
static class Program
{
    static void Main()
    {
        const int LOOP = 10000000, INDEX = 100, VALUE = 512;
        byte[] buffer = new byte[1024];
        Stopwatch watch;

        watch = Stopwatch.StartNew();
        for (int i = 0; i < LOOP; i++)
        {
            Set1(buffer, INDEX, VALUE);
        }
        watch.Stop();
        Console.WriteLine("Set1: " + watch.ElapsedMilliseconds + "ms");

        watch = Stopwatch.StartNew();
        for (int i = 0; i < LOOP; i++)
        {
            Set2(buffer, INDEX, VALUE);
        }
        watch.Stop();
        Console.WriteLine("Set2: " + watch.ElapsedMilliseconds + "ms");

        watch = Stopwatch.StartNew();
        for (int i = 0; i < LOOP; i++)
        {
            Set3(buffer, INDEX, VALUE);
        }
        watch.Stop();
        Console.WriteLine("Set3: " + watch.ElapsedMilliseconds + "ms");

        watch = Stopwatch.StartNew();
        for (int i = 0; i < LOOP; i++)
        {
            Set4(buffer, INDEX, VALUE);
        }
        watch.Stop();
        Console.WriteLine("Set4: " + watch.ElapsedMilliseconds + "ms");

        Console.WriteLine("done");
        Console.ReadLine();
    }
    unsafe static void Set1(byte[] target, int index, int value)
    {
        fixed (byte* p = &target[0])
        {
            Marshal.WriteInt32(new IntPtr(p), index, value);
        }
    }

    unsafe static void Set2(byte[] target, int index, int value)
    {
        int* p = &value;
        for (int i = 0; i < 4; i++)
        {
            target[index + i] = *((byte*)p + i);
        }
    }

    static void Set3(byte[] target, int index, int value)
    {
        byte[] data = BitConverter.GetBytes(value);
        Buffer.BlockCopy(data, 0, target, index, data.Length);
    }
    static void Set4(byte[] target, int index, int value)
    {
        target[index++] = (byte)value;
        target[index++] = (byte)(value >> 8);
        target[index++] = (byte)(value >> 16);
        target[index] = (byte)(value >> 24);
    }
}
like image 108
Marc Gravell Avatar answered Sep 22 '22 16:09

Marc Gravell