
Extracting Values Across Byte Boundaries With Arbitrary Bit Positions and Lengths In C#

I am currently working on a network tool that needs to decode/encode a particular protocol that packs fields into dense bit arrays at arbitrary positions. For example, one part of the protocol uses 3 bytes to represent a number of different fields:

Bit Position(s)  Length (In Bits)    Type
0                1                   bool
1-5              5                   int
6-13             8                   int
14-22            9                   uint
23               1                   bool

As you can see, several of the fields span multiple bytes. Many (most) are also shorter than the built-in type that might be used to represent them, such as the first int field which is only 5 bits long. In these cases, the most significant bits of the target type (such as an Int32 or Int16) should be padded with 0 to make up the difference.

My problem is that I am having a difficult time processing this kind of data. Specifically, I am having a hard time figuring out how to efficiently get arbitrary length bit arrays, populate them with the appropriate bits from the source buffer, pad them to match the target type, and convert the padded bit arrays to the target type. In an ideal world, I would be able to take the byte[3] in the example above and call a method like GetInt32(byte[] bytes, int startBit, int length).

The closest thing in the wild that I've found is a BitStream class, but it appears to want individual values to line up on byte/word boundaries (and the half-streaming/half-indexed access convention of the class makes it a little confusing).

My own first attempt was to use the BitArray class, but that proved somewhat unwieldy. It's easy enough to stuff all the bits from the buffer into a large BitArray, transfer only the ones you want from the source BitArray to a new temporary BitArray, and then convert that into the target value...but it seems wrong, and very time consuming.
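
For reference, the BitArray approach described above might look roughly like the following sketch. The class name BitArrayExtract is made up for illustration, and the method signature just mirrors the GetInt32(byte[] bytes, int startBit, int length) wish from earlier. Note that BitArray numbers bits starting from the least significant bit of each byte, which may or may not match the protocol's own numbering.

```csharp
using System;
using System.Collections;

static class BitArrayExtract
{
    // Hypothetical helper: reads `length` bits starting at `startBit`, using
    // BitArray's numbering (bit 0 = least significant bit of bytes[0]).
    public static int GetInt32(byte[] bytes, int startBit, int length)
    {
        var source = new BitArray(bytes);   // one bool per bit of the buffer
        var temp = new BitArray(32);        // all false, so unused high bits stay 0
        for (int i = 0; i < length; i++)
        {
            temp[i] = source[startBit + i];
        }
        var result = new int[1];
        temp.CopyTo(result, 0);             // BitArray.CopyTo accepts an int[] target
        return result[0];
    }
}
```

With new byte[] { 0xAD, 0x01 }, GetInt32(bytes, 1, 5) gives 22 and GetInt32(bytes, 6, 8) gives 6. It works, but every call allocates two BitArrays and copies one bit at a time, which is the overhead that bothers me.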

I am now considering a class like the following that references (or creates) a source/target byte[] buffer along with an offset and provides get and set methods for certain target types. The tricky part is that getting/setting values may span multiple bytes.

class BitField
{
    private readonly byte[] _bytes;
    private readonly int _offset;

    public BitField(byte[] bytes)
        : this(bytes, 0)
    {
    }

    public BitField(byte[] bytes, int offset)
    {
        _bytes = bytes;
        _offset = offset;
    }

    public BitField(int size)
        : this(new byte[size], 0)
    {
    }

    public bool this[int bit]
    {
        get { return IsSet(bit); }
        set { if (value) Set(bit); else Clear(bit); }
    }

    public bool IsSet(int bit)
    {
        return (_bytes[_offset + (bit / 8)] & (1 << (bit % 8))) != 0;
    }

    public void Set(int bit)
    {
        _bytes[_offset + (bit / 8)] |= unchecked((byte)(1 << (bit % 8)));
    }

    public void Clear(int bit)
    {
        _bytes[_offset + (bit / 8)] &= unchecked((byte)~(1 << (bit % 8)));
    }

    //startIndex = the index of the bit at which to start fetching the value
    //length = the number of bits to include - may be less than 32 in which case
    //the most significant bits of the target type should be padded with 0
    public int GetInt32(int startIndex, int length)
    {
        //NEED CODE HERE
    }

    //startIndex = the index of the bit at which to start storing the value
    //length = the number of bits to use, if less than the number of bits required
    //for the source type, precision may be lost
    //value = the value to store
    public void SetValue(int startIndex, int length, int value)
    {
        //NEED CODE HERE
    }

    //Other Get.../Set... methods go here
}

I am looking for any guidance in this area such as third-party libraries, algorithms for getting/setting values at arbitrary bit positions that span multiple bytes, feedback on my approach, etc. I included the class above for clarification and am not necessarily looking for code to fill it in (though I won't argue if someone wants to work it out!).

daveaglick asked Jul 11 '11 17:07



2 Answers

As promised, here is the class I ended up creating for this purpose. It wraps an arbitrary byte array at an optionally specified byte offset and allows reading and writing at the bit level. It provides methods for reading/writing arbitrary blocks of bits from other byte arrays and for reading/writing primitive values with user-defined offsets and lengths. Bit 0 is the most significant bit of the first wrapped byte. It works very well for my situation and solves the exact question I asked above. However, it does have a few shortcomings. The first is that it is not well documented; I just haven't had the time. The second is that there are no bounds or other checks. It also currently requires the MiscUtil library to provide endian conversion. All that said, hopefully this can help solve or serve as a starting point for someone else with a similar use case.

internal class BitField
{
    private readonly byte[] _bytes;
    private readonly int _offset;
    private EndianBitConverter _bitConverter = EndianBitConverter.Big;

    public BitField(byte[] bytes)
        : this(bytes, 0)
    {
    }

    //offset = the offset (in bytes) into the wrapped byte array
    public BitField(byte[] bytes, int offset)
    {
        _bytes = bytes;
        _offset = offset;
    }

    public BitField(int size)
        : this(new byte[size], 0)
    {
    }

    //fill == true = initially set all bits to 1
    public BitField(int size, bool fill)
        : this(new byte[size], 0)
    {
        if (!fill) return;
        for(int i = 0 ; i < size ; i++)
        {
            _bytes[i] = 0xff;
        }
    }

    public byte[] Bytes
    {
        get { return _bytes; }
    }

    public int Offset
    {
        get { return _offset; }
    }

    public EndianBitConverter BitConverter
    {
        get { return _bitConverter; }
        set { _bitConverter = value; }
    }

    public bool this[int bit]
    {
        get { return IsBitSet(bit); }
        set { if (value) SetBit(bit); else ClearBit(bit); }
    }

    public bool IsBitSet(int bit)
    {
        return (_bytes[_offset + (bit / 8)] & (1 << (7 - (bit % 8)))) != 0;
    }

    public void SetBit(int bit)
    {
        _bytes[_offset + (bit / 8)] |= unchecked((byte)(1 << (7 - (bit % 8))));
    }

    public void ClearBit(int bit)
    {
        _bytes[_offset + (bit / 8)] &= unchecked((byte)~(1 << (7 - (bit % 8))));
    }

    //index = the index of the source BitField at which to start getting bits
    //length = the number of bits to get
    //size = the total number of bytes required (0 for arbitrary length return array)
    //fill == true = set all padding bits to 1
    public byte[] GetBytes(int index, int length, int size, bool fill)
    {
        if(size == 0) size = (length + 7) / 8;
        BitField bitField = new BitField(size, fill);
        for(int s = index, d = (size * 8) - length ; s < index + length && d < (size * 8) ; s++, d++)
        {
            bitField[d] = IsBitSet(s);
        }
        return bitField._bytes;
    }

    public byte[] GetBytes(int index, int length, int size)
    {
        return GetBytes(index, length, size, false);
    }

    public byte[] GetBytes(int index, int length)
    {
        return GetBytes(index, length, 0, false);
    }

    //bytesIndex = the index (in bits) into the bytes array at which to start copying
    //index = the index (in bits) in this BitField at which to put the value
    //length = the number of bits to copy from the bytes array
    public void SetBytes(byte[] bytes, int bytesIndex, int index, int length)
    {
        BitField bitField = new BitField(bytes);
        for (int i = 0; i < length; i++)
        {
            this[index + i] = bitField[bytesIndex + i];
        }
    }

    public void SetBytes(byte[] bytes, int index, int length)
    {
        SetBytes(bytes, 0, index, length);
    }

    public void SetBytes(byte[] bytes, int index)
    {
        SetBytes(bytes, 0, index, bytes.Length * 8);
    }

    //UInt16

    //index = the index (in bits) at which to start getting the value
    //length = the number of bits to use for the value, if less than required the value is padded with 0
    public ushort GetUInt16(int index, int length)
    {
        return _bitConverter.ToUInt16(GetBytes(index, length, 2), 0);
    }

    public ushort GetUInt16(int index)
    {
        return GetUInt16(index, 16);
    }

    //valueIndex = the index (in bits) of the value at which to start copying
    //index = the index (in bits) in this BitField at which to put the value
    //length = the number of bits to copy from the value
    public void Set(ushort value, int valueIndex, int index, int length)
    {
        SetBytes(_bitConverter.GetBytes(value), valueIndex, index, length);
    }

    public void Set(ushort value, int index)
    {
        Set(value, 0, index, 16);
    }

    //UInt32

    public uint GetUInt32(int index, int length)
    {
        return _bitConverter.ToUInt32(GetBytes(index, length, 4), 0);
    }

    public uint GetUInt32(int index)
    {
        return GetUInt32(index, 32);
    }

    public void Set(uint value, int valueIndex, int index, int length)
    {
        SetBytes(_bitConverter.GetBytes(value), valueIndex, index, length);
    }

    public void Set(uint value, int index)
    {
        Set(value, 0, index, 32);
    }

    //UInt64

    public ulong GetUInt64(int index, int length)
    {
        return _bitConverter.ToUInt64(GetBytes(index, length, 8), 0);
    }

    public ulong GetUInt64(int index)
    {
        return GetUInt64(index, 64);
    }

    public void Set(ulong value, int valueIndex, int index, int length)
    {
        SetBytes(_bitConverter.GetBytes(value), valueIndex, index, length);
    }

    public void Set(ulong value, int index)
    {
        Set(value, 0, index, 64);
    }

    //Int16

    public short GetInt16(int index, int length)
    {
        return _bitConverter.ToInt16(GetBytes(index, length, 2, IsBitSet(index)), 0);
    }

    public short GetInt16(int index)
    {
        return GetInt16(index, 16);
    }

    public void Set(short value, int valueIndex, int index, int length)
    {
        SetBytes(_bitConverter.GetBytes(value), valueIndex, index, length);
    }

    public void Set(short value, int index)
    {
        Set(value, 0, index, 16);
    }

    //Int32

    public int GetInt32(int index, int length)
    {
        return _bitConverter.ToInt32(GetBytes(index, length, 4, IsBitSet(index)), 0);
    }

    public int GetInt32(int index)
    {
        return GetInt32(index, 32);
    }

    public void Set(int value, int valueIndex, int index, int length)
    {
        SetBytes(_bitConverter.GetBytes(value), valueIndex, index, length);
    }

    public void Set(int value, int index)
    {
        Set(value, 0, index, 32);
    }

    //Int64

    public long GetInt64(int index, int length)
    {
        return _bitConverter.ToInt64(GetBytes(index, length, 8, IsBitSet(index)), 0);
    }

    public long GetInt64(int index)
    {
        return GetInt64(index, 64);
    }

    public void Set(long value, int valueIndex, int index, int length)
    {
        SetBytes(_bitConverter.GetBytes(value), valueIndex, index, length);
    }

    public void Set(long value, int index)
    {
        Set(value, 0, index, 64);
    }

    //Char

    public char GetChar(int index, int length)
    {
        return _bitConverter.ToChar(GetBytes(index, length, 2), 0);
    }

    public char GetChar(int index)
    {
        return GetChar(index, 16);
    }

    public void Set(char value, int valueIndex, int index, int length)
    {
        SetBytes(_bitConverter.GetBytes(value), valueIndex, index, length);
    }

    public void Set(char value, int index)
    {
        Set(value, 0, index, 16);
    }

    //Bool

    public bool GetBool(int index, int length)
    {
        return _bitConverter.ToBoolean(GetBytes(index, length, 1), 0);
    }

    public bool GetBool(int index)
    {
        return GetBool(index, 8);
    }

    public void Set(bool value, int valueIndex, int index, int length)
    {
        SetBytes(_bitConverter.GetBytes(value), valueIndex, index, length);
    }

    public void Set(bool value, int index)
    {
        Set(value, 0, index, 8);
    }

    //Single and double precision floating point values must always use the correct number of bits
    public float GetSingle(int index)
    {
        return _bitConverter.ToSingle(GetBytes(index, 32, 4), 0);
    }

    public void SetSingle(float value, int index)
    {
        SetBytes(_bitConverter.GetBytes(value), 0, index, 32);
    }

    public double GetDouble(int index)
    {
        return _bitConverter.ToDouble(GetBytes(index, 64, 8), 0);
    }

    public void SetDouble(double value, int index)
    {
        SetBytes(_bitConverter.GetBytes(value), 0, index, 64);
    }
}
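
If the MiscUtil dependency is a problem, a shift-based reader that follows the same bit numbering (bit 0 is the most significant bit of the first byte) could look something like the sketch below. BitReader, ReadUInt32 and ReadInt32 are hypothetical names and are not part of the class above. The signed variant assumes two's-complement fields and sign-extends them, which is the same effect the class gets by filling the padding with the field's first bit.

```csharp
using System;

static class BitReader
{
    // Hypothetical helper: reads `length` bits (0..32) starting at bit `index`,
    // where bit 0 is the most significant bit of bytes[0]. The result is
    // zero-padded in its upper bits.
    public static uint ReadUInt32(byte[] bytes, int index, int length)
    {
        if (length < 0 || length > 32)
            throw new ArgumentOutOfRangeException(nameof(length));
        if (index < 0 || index + length > bytes.Length * 8)
            throw new ArgumentOutOfRangeException(nameof(index));

        uint result = 0;
        for (int i = index; i < index + length; i++)
        {
            // Pick out bit i and append it at the low end of the accumulator.
            uint bit = (uint)((bytes[i / 8] >> (7 - (i % 8))) & 1);
            result = (result << 1) | bit;
        }
        return result;
    }

    // Same read, but the field's first bit is treated as a sign bit.
    public static int ReadInt32(byte[] bytes, int index, int length)
    {
        uint raw = ReadUInt32(bytes, index, length);
        if (length == 0 || length == 32) return unchecked((int)raw);
        int shift = 32 - length;
        // Move the sign bit to bit 31, then shift back arithmetically.
        return unchecked((int)(raw << shift)) >> shift;
    }
}
```

For the 3-byte layout in the question and the buffer { 0xB5, 0x9C, 0x01 }, ReadUInt32(bytes, 1, 5) gives 13, ReadUInt32(bytes, 6, 8) gives 103, ReadUInt32(bytes, 14, 9) gives 0, and ReadInt32(bytes, 0, 4) gives -5 because the field 1011 is sign-extended.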
daveaglick answered Oct 19 '22 01:10



If your packets never exceed 4 or 8 bytes, it is easier to store each packet in an Int32 or Int64; the byte array only complicates things. You do have to pay attention to big-endian vs. little-endian storage when converting between the bytes and the integer.

And then, for a 3-byte packet:

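// message is passed by ref so the caller sees the updated packet
public static void SetValue(ref Int32 message, int startIndex, int length, int value)
{
   // we want a mask of 'length' one-bits
   int mask = (1 << length) - 1;
   value = value & mask;  // or check and throw

   int offset = 24 - startIndex - length;   // 24 = 3 * 8
   message = message & ~(mask << offset);   // clear whatever the field held before
   message = message | (value << offset);
}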
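
Reading a field back out is the same idea in reverse: shift right by the same offset, then mask. A minimal sketch, assuming the 3-byte packet sits in the low 24 bits of the int with the first byte most significant; PackedMessage, Pack and GetValue are hypothetical names used only for illustration.

```csharp
using System;

static class PackedMessage
{
    // Hypothetical helper: packs 3 bytes into the low 24 bits of an int,
    // first byte most significant.
    public static int Pack(byte[] b)
    {
        return (b[0] << 16) | (b[1] << 8) | b[2];
    }

    // Hypothetical counterpart to SetValue: startIndex counts from the most
    // significant of the 24 packet bits. Intended for lengths up to 24.
    public static int GetValue(int message, int startIndex, int length)
    {
        int mask = (1 << length) - 1;            // 'length' one-bits
        int offset = 24 - startIndex - length;   // 24 = 3 * 8
        return (message >> offset) & mask;
    }
}
```

For the bytes { 0xB5, 0x9C, 0x01 }, Pack returns 0xB59C01, GetValue(message, 1, 5) gives 13 and GetValue(message, 6, 8) gives 103.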
Henk Holterman answered Oct 19 '22 01:10
