 

C: Size of a structure when bit fields are used, and how it is stored in memory

Tags:

c

struct

#include <stdio.h>

int main(void)
{
  struct bitfield
  {
    signed int a :3;
    unsigned int b :13;
    unsigned int c :1;
  };

  struct bitfield bit1 = { 2, 14, 1 };
  printf("%zu\n", sizeof(bit1));  /* sizeof yields a size_t, so use %zu */
  return 0;
}

Why is the size 4 bytes here? And how exactly are these elements stored in memory?

hello_hi asked Aug 17 '13 02:08




1 Answer

Almost every aspect of bit fields is implementation-defined. Even the signedness of a 'plain int' bit field is implementation-defined; it may be signed or unsigned. The layout of the fields is implementation-defined too: they may be allocated from the most significant bit to the least significant bit of the containing 'unit' (the term used in the standard), or from least to most significant. The size of the largest permissible bit field and the point at which a bit field is placed in a new unit are implementation-defined as well.

For example, on Mac OS X 10.8.4 using GCC 4.8.1, it is possible to demonstrate that the struct bitfield in the question is laid out with member a occupying the 3 least significant bits (bits 0-2), member b occupying the next 13 bits (bits 3-15), and member c occupying the next 1 bit (bit 16):

#include <stdio.h>

static void print_info(unsigned int v);

int main(void)
{
    /* Unsigned, because values such as 0xAAAAAAAA do not fit in a 32-bit int. */
    unsigned int values[] =
    {
        0x55555555, 0xAAAAAAAA, 0x87654321, 0xFEDCBA98,
        0xFEDCBA90, 0xFEDCBA91, 0xFEDCBA92, 0xFEDCBA93,
        0xFEDCBA94, 0xFEDCBA95, 0xFEDCBA96, 0xFEDCBA97,
        0xFEDCBA98, 0xFEDCBAA0, 0xFEDCBAA8, 0x0000BAA0,
        0x0001BAA0, 0x00000008, 0x00000010, 0x00000018,
        0x0000FFF0, 0x0000FFF8,
    };

    for (size_t i = 0; i < sizeof(values)/sizeof(values[0]); i++)
        print_info(values[i]);
    return 0;
}

/* Overlay the bit-field struct on an unsigned int and print each field. */
static void print_info(unsigned int v)
{
    union
    {
        unsigned int x;
        struct bitfield
        {
            signed int   a:3;
            unsigned int b:13;
            unsigned int c:1;
        } y;
    } u;
    u.x = v;
    /* The casts make the arguments match the unsigned %X conversions. */
    printf("0x%.8X => %2d 0x%.4X %1X\n", u.x, u.y.a,
           (unsigned int)u.y.b, (unsigned int)u.y.c);
}

Sample output:

0x55555555 => -3 0x0AAA 1
0xAAAAAAAA =>  2 0x1555 0
0x87654321 =>  1 0x0864 1
0xFEDCBA98 =>  0 0x1753 0
0xFEDCBA90 =>  0 0x1752 0
0xFEDCBA91 =>  1 0x1752 0
0xFEDCBA92 =>  2 0x1752 0
0xFEDCBA93 =>  3 0x1752 0
0xFEDCBA94 => -4 0x1752 0
0xFEDCBA95 => -3 0x1752 0
0xFEDCBA96 => -2 0x1752 0
0xFEDCBA97 => -1 0x1752 0
0xFEDCBA98 =>  0 0x1753 0
0xFEDCBAA0 =>  0 0x1754 0
0xFEDCBAA8 =>  0 0x1755 0
0x0000BAA0 =>  0 0x1754 0
0x0001BAA0 =>  0 0x1754 1
0x00000008 =>  0 0x0001 0
0x00000010 =>  0 0x0002 0
0x00000018 =>  0 0x0003 0
0x0000FFF0 =>  0 0x1FFE 0
0x0000FFF8 =>  0 0x1FFF 0

The test values are not chosen completely at random. From the test values 0xFEDCBA90 to 0xFEDCBA97, we can see that the least significant 3 bits hold member a. From the test values 0x0000BAA0 and 0x0001BAA0, we can see that the 17th bit (bit 16) holds member c. And from the test values 0x00000008 to 0x0000FFF8, we can see that bits 3-15 hold member b.

It must, however, be pointed out that the portability of the code is debatable: it writes to u.x and then reads u.y.a, u.y.b and u.y.c, so it is not reading the member of the union last written to. C89 left the result implementation-defined, while C99 (since TC3) and C11 say in a footnote to §6.5.2.3 that the bytes are reinterpreted as the new type, possibly yielding a trap representation; C++ treats such a read as undefined behaviour. In practice, it 'always' works (I've not heard of a system where it doesn't work; it is unlikely but not technically impossible that there is one).

This layout is not the only possible layout by any stretch of the imagination. However, I don't have access to compilers or systems that demonstrate alternative layouts.


ISO/IEC 9899:2011, §6.7.2.1 Structure and union specifiers, says:

¶11 An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined. The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified.

¶12 A bit-field declaration with no declarator, but only a colon and a width, indicates an unnamed bit-field. [126] As a special case, a bit-field structure member with a width of 0 indicates that no further bit-field is to be packed into the unit in which the previous bit-field, if any, was placed.

[126] An unnamed bit-field structure member is useful for padding to conform to externally imposed layouts.

A slight variant of the structure in the question is:

struct exegesis
{
    signed int   a:3;
    unsigned int  :0;
    unsigned int b:13;
    unsigned int  :0;
    unsigned int c:1;
};

This structure has a size of 12 (on the same compiler/platform as before). The storage unit for bit fields on this platform is 4 bytes, and each anonymous zero-width field forces the next bit field into a new storage unit. Member a is stored in the least significant 3 bits of the first 4-byte unit; member b in the least significant 13 bits of the second 4-byte unit; and member c in the least significant bit of the third 4-byte unit. As noted in the quote from the standard, you can have anonymous bit fields with a width greater than 0 too.

Jonathan Leffler answered Nov 14 '22 23:11