
Is a bit field any more efficient (computationally) than masking bits and extracting the data by hand?

I have numerous small pieces of data that I want to be able to shove into one larger data type. Let's say that, hypothetically, this is a date and time. The obvious method is via a bit field like this.

struct dt
{
    unsigned long minute :6;
    unsigned long hour :5;
    unsigned long day :5;
    unsigned long month :4;
    unsigned long year :12;
}stamp;

Now let's pretend that this thing is ordered so that things declared first are at bits of higher significance than things declared later. If I represent the bits by the first letter of the variable, it would look like:

mmmmmm|hhhhh|ddddd|mmmm|yyyyyyyyyyyy

Finally, let's pretend that I simply declare an unsigned long and split it up using masks and shifts to do the same things.

unsigned long dateTime;

Here is my question:
Are the following two methods of accessing minutes, hours, etc. equivalent in terms of what the computer needs to do? Or is there some tricksy method that the compiler/computer uses with the bit fields?

unsigned minutes = stamp.minute;
//versus
unsigned minutes = (dateTime & 0xFC000000) >> 26;

and

unsigned hours = stamp.hour;
//versus
unsigned hours = (dateTime & 0x03E00000) >> 21;

etc.

James Matta asked Sep 27 '09 17:09


4 Answers

The compiler generates the same instructions that you would explicitly write to access the bits. So don't expect it to be faster with bitfields.

In fact, strictly speaking, with bit fields you don't control how they are positioned in the word of data (unless your compiler gives you additional guarantees; the C99 standard doesn't define any). Doing the masks by hand, you can at least place the two most often accessed fields first and last in the series, because in these two positions it takes one operation instead of two to isolate the field.
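To illustrate the point about the first and last positions, here is a minimal sketch. It assumes the 32-bit layout from the question (minute in the top 6 bits, year in the bottom 12); the function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout, most significant bit first:
   minute:6 | hour:5 | day:5 | month:4 | year:12 */

/* Hypothetical helper that builds the packed word by hand. */
static uint32_t pack(uint32_t minute, uint32_t hour, uint32_t day,
                     uint32_t month, uint32_t year)
{
    assert(minute < 64 && hour < 32 && day < 32 && month < 16 && year < 4096);
    return (minute << 26) | (hour << 21) | (day << 16) | (month << 12) | year;
}

/* Top field: a shift alone is enough, the vacated high bits become zero. */
static uint32_t get_minute(uint32_t dt) { return dt >> 26; }

/* Bottom field: a mask alone is enough, no shift is needed. */
static uint32_t get_year(uint32_t dt) { return dt & 0xFFFu; }

/* Middle field: needs both a shift and a mask. */
static uint32_t get_hour(uint32_t dt) { return (dt >> 21) & 0x1Fu; }
```

Whether the saved operation matters in practice depends on the target and the surrounding code.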

Pascal Cuoq answered Oct 07 '22 10:10


Those will probably compile to the same machine code, but if it really matters, benchmark it. Or, better yet, just use the bitfield because it's easier!

Quickly testing gcc yields:

shrq    $6, %rdi             ; using bit field
movl    %edi, %eax
andl    $31, %eax

vs.

andl    $130023424, %edi     ; by-hand
shrl    $21, %edi
movl    %edi, %eax

This is a little-endian machine, so the numbers are different, but the three instructions are nearly the same.
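For anyone who wants to repeat the comparison, a sketch along these lines can be compiled to assembly (for example with `gcc -O2 -S`) and the two functions compared side by side. The struct and function names are illustrative, not the ones used for the listing above, and the bit-field layout is whatever the compiler chooses, so the shift counts in the output may differ from the hand-written ones.

```c
#include <stdint.h>

struct dt_fields {
    unsigned int minute : 6;
    unsigned int hour   : 5;
    unsigned int day    : 5;
    unsigned int month  : 4;
    unsigned int year   : 12;
};

/* Bit-field access: the compiler picks the shift and the mask. */
unsigned int hour_from_fields(struct dt_fields s)
{
    return s.hour;
}

/* Manual access, assuming hour occupies bits 21..25 of a 32-bit word. */
unsigned int hour_from_word(uint32_t w)
{
    return (w & 0x03E00000u) >> 21;
}
```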

derobert answered Oct 07 '22 09:10


In this example I would pack the bits manually rather than use a bit field.
Not because of access speed, but because of the ability to compare two dt values.
In the end the compiler will generally generate better code than you (compilers keep improving and don't make careless mistakes), but this code is simple enough that you will probably write optimal code anyway. This is the kind of micro-optimization you should not be worrying about.

If your dt is an integer formatted as:

yyyyyyyyyyyy|mmmm|ddddd|hhhhh|mmmmmm

Then you can naturally compare them like this.

dt t1(getTimeStamp());
dt t2(getTimeStamp());

if (t1 < t2)
{    std::cout << "T1 happened before T2\n";
}
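One way such an integer could be built is sketched below. `pack_dt` and `dt_before` are hypothetical helpers, not something from the question; the sketch assumes a 32-bit word with the year in the top 12 bits, so that plain integer comparison matches chronological order.

```c
#include <assert.h>
#include <stdint.h>

/* Layout, most significant first:
   year:12 | month:4 | day:5 | hour:5 | minute:6 */
static uint32_t pack_dt(unsigned year, unsigned month, unsigned day,
                        unsigned hour, unsigned minute)
{
    /* Reject values that would spill into a neighbouring field. */
    assert(year < 4096 && month < 16 && day < 32 && hour < 32 && minute < 64);
    return ((uint32_t)year << 20) | ((uint32_t)month << 16) |
           ((uint32_t)day << 11) | ((uint32_t)hour << 6) | (uint32_t)minute;
}

/* Returns nonzero if a is strictly earlier than b. */
static int dt_before(uint32_t a, uint32_t b)
{
    return a < b;
}
```

Because the most significant field is the coarsest unit, a single unsigned comparison replaces the field-by-field chain.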

By using a bit field structure the code looks like this:

dt t1(getTimeStamp());
dt t2(getTimeStamp());

if (convertToInt(t1) < convertToInt(t2))
{    std::cout << "T1 happened before T2\n";
}
// or
if ((t1.year < t2.year)
    || ((t1.year == t2.year) && ((t1.month < t2.month)
      || ((t1.month == t2.month) && ((t1.day < t2.day)
        || ((t1.day == t2.day) && (t1.hour  etc.....

Of course you could get the best of both worlds by using a union that has the structure on one side and the int as the alternative. Obviously this depends on exactly how your compiler lays out bit fields, so you will need to test that the fields are placed in the correct positions (this would be a perfect place to learn about TDD).
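A rough sketch of the union idea in C follows. The names `dt_u` and `layout_is_as_assumed` are invented here, and the field order assumes a compiler that allocates bit-fields starting from the least significant bit (as GCC does on little-endian targets); that is exactly the part the runtime check is meant to verify. Note that reading a union member other than the one last written is accepted in C but not in C++.

```c
#include <stdint.h>

union dt_u {
    struct {
        /* Declared least significant first. This ordering is an
           assumption about the compiler, not a language guarantee. */
        unsigned int minute : 6;
        unsigned int hour   : 5;
        unsigned int day    : 5;
        unsigned int month  : 4;
        unsigned int year   : 12;
    } f;
    uint32_t raw;   /* the same 32 bits viewed as one comparable integer */
};

/* Runtime check that the compiler placed the fields where we assumed:
   minute at bit 0 and year starting at bit 20. */
static int layout_is_as_assumed(void)
{
    union dt_u u;
    u.raw = 0;
    u.f.year = 1;
    u.f.minute = 1;
    return u.raw == ((UINT32_C(1) << 20) | 1u);
}
```

If the check fails on a given compiler, comparing `raw` values is not meaningful and the manual packing approach is the safer choice.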

Martin York answered Oct 07 '22 10:10


It is entirely platform and compiler dependent. Some processors, especially microcontrollers, have bit addressing instructions or bit addressable memory, and the compiler can use these directly if you use built-in language constructs. If you use bit-masking to operate on bits on such a processor, the compiler will have to be smarter to spot the potential optimisation.

On most desktop platforms I would suggest that you are sweating the small stuff, but if you need to know, you should test it by profiling or timing the code, or analyse the generated code. Note that you may get very different results depending on compiler optimisation options, and even different compilers.
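As a starting point for timing, something like the following could be used. It is only a sketch: `hour_of` and `timed_sum_hours` are made-up names, the field position is assumed to be bits 21..25, and `clock()` measures CPU time at a coarse resolution, so a large input is needed for meaningful numbers.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Manual extraction of a 5-bit field at bits 21..25 (assumed layout). */
static uint32_t hour_of(uint32_t w)
{
    return (w >> 21) & 0x1Fu;
}

/* Sums the hour field over n words and stores the CPU time used, in
   seconds, in *elapsed. Using the returned sum in the caller helps keep
   the loop from being optimised away. */
static uint64_t timed_sum_hours(const uint32_t *words, size_t n,
                                double *elapsed)
{
    clock_t start = clock();
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += hour_of(words[i]);
    *elapsed = (double)(clock() - start) / CLOCKS_PER_SEC;
    return sum;
}
```

Writing a second version of the loop that reads a bit-field member instead, and running both at several optimisation levels, gives the comparison described above.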

Clifford answered Oct 07 '22 11:10