Why does padding have to be a power of two?

Tags:

struct

I am doing some sample programs to explore C and would like to know why structure padding can be done in power of two only.

#include <stdio.h>

#pragma pack(push, 3)

union aaaa
{

   struct bbb
   {
      int a;
      double b;
      char c;
   }xx;

   float f;
};

#pragma pack(pop)

int main()
{

printf("\n Size: %d", sizeof(union aaaa));

return 0;
}

While compiling

warning: alignment must be a small power of two, not 3 [-Wpragmas]
warning: #pragma pack (pop) encountered without matching #pragma pack (push) [-Wpragmas]

It seems #pragma has no effect. Output is 24 only. i.e 4 byte aligned.

661

asked Jul 26 '12 10:07

1 Answers

The short answer is that basic objects in processors have sizes that are small powers of two (e.g., 1, 2, 4, 8, and 16 bytes) and memory is organized in groups whose size is a small power of two (e.g., 8 bytes), so structures must be aligned to work well with those sizes.

The long answer is that the reasons for this are grounded in physics and elementary mathematics. Computers naturally work with bits, with values 0 and 1. This is because it is easy to design physical things that switch between two values: High voltage and low voltage, presence of a charge or absence of a charge, et cetera. Distinguishing between three values is harder, because you have to be more sensitive about transitions between values. So, as computer technology has developed over the decades, we have used bits (binary digits) instead of alternatives like trinary digits.

To make larger numbers, we combine multiple bits. So two bits can, combined, have four values. Three bits can have eight values, and so on. In older computers, sometimes bits were grouped six or ten at a time. However, eight became common, and is essentially standard now. Using eight bits for a byte does not have as strong a physical reason as some of the other groupings I describe, but it is the way of the world.

Another feature of computers is memory. Once we have these bytes, we want to store a lot of them, in a device that is easily accessible to the processor, so we can get lots of bytes into and out of the processor quickly. When we have a lot of bytes, we need a way for the processor to tell memory which bytes the processor wants to read or write. So the processor needs a way to address the bytes.

The processor uses bits for values, so it is going to use bits for address values. So the memory will be built to accept bits to indicate which bytes to supply to the processor when the processor reads or which bytes to store when the processor writes. What does the memory device do with those bits? An easy thing is to use one bit to control one switch of the pathways to memory. The memory will be made out of many small parts that store bytes.

Consider a thing in the memory device that can store a byte, and consider two of those things next to each other, say A and B. We can use a switch to select whether we want the A byte to be active or the B byte to be active. Now consider four of those things, say A, B, C, and D. We can use one switch to select whether to use the A-B group or to use the C-D group. Then another switch selects A or B (if using the A-B group) or C or D (if using the C-D) group.

This process continues: Each bit in a memory address selects a group of storage units to use. 1 bit selects between 2 storage units, 2 select between 4, 3 select between 8, 4 select between 16, and so on. 8 bits select between 256 storage units, 24 bits select between 16,777,216 storage units, and 32 bits select between 4,294,967,296 storage units.

There is one more complication. Moving individual bytes between the processor and memory is slow. Instead, modern computers organize memory into larger pieces, such as eight bytes. You can only move eight bytes at a time between memory and the processor. When the processor requests that memory supply some data, the processor only sends the high bits of the address—the low three bits select individual bytes within eight bytes, and they are not sent to memory.

This is faster because the processor gets eight bytes in the time it would otherwise take to have the memory do all its switching to supply one byte, and it is cheaper because you do not need the huge number of extra switches it would take to distinguish individual bytes in memory.

However, now it means the processor has no way of getting an individual byte from memory. When you execute an instruction that accesses an individual byte, the processor must read eight bytes from memory and then shift those bytes around inside the processor to get the one byte you want. Similarly, to get two or four bytes, the processor reads eight bytes and extracts just the bytes you want.

To simplify this process, processor designers specify that data should be aligned in certain ways. Typically, they require two-byte data (like 16-bit integers) to be aligned to multiples of two bytes, four-byte data (like 32-bit integers and 32-bit floating-point values) to be aligned to multiples of four bytes, and eight-byte data to be aligned to multiples of eight bytes.

This required alignment has two effects. First, because four-byte data can only start at two places in an eight-byte chunk read from memory (the beginning or the middle), the processor designers only need to put in wires to extract the four bytes from two places. They do not need to add all the extra wires to extract four bytes from any of the eight individual bytes that could be starting places if any alignment were allowed. (Some processors will completely prohibit loading unaligned data, and some processors will allow it but use slow methods to extract it that use fewer wires but use an iterative algorithm to shift the data over several processor cycles, so unaligned loads are slow.)

The second effect is that, because four-byte data can only start at two places in an eight-byte chunk, it also ends inside that chunk. Consider what would happen if you tried to load four bytes of data that started at the sixth byte of an eight-byte chunk. The first two bytes are in the chunk, but the next two bytes are in the next chunk in memory. The processor would have to read two chunks from memory, take different bytes from each of them, and put those bytes together. That is much slower than just reading one chunk.

So, memory is organized by powers of two because that is a natural result of bits, and processors require alignment because that makes memory access more efficient. The alignment is naturally powers of two, and that is why your structure sizes work better when they are multiples of the powers of two that are used for alignment.

182

answered Sep 28 '22 23:09

Eric Postpischil

Related questions
                            
                                Does C language support inheritance?
                            
                                Is there a standard way to detect bit width of hardware?
                            
                                Why Bitshift operators?
                            
                                How do the "->" and "." member access operations differ in C
                            
                                char arrays length in C
                            
                                C strtok() split string into tokens but keep old data unaltered
                            
                                How can I test this C function
                            
                                Does the -O0 compiler flag have the same effect as the volatile keyword in C?
                            
                                is it possible to use regex in c++?
                            
                                Is assert and unit-testing incompatible?
                            
                                C compiler from where?
                            
                                C# and C++ in relation to C
                            
                                Understanding the '&' operator
                            
                                Does sizeof(float) always equal to sizeof(int) on all architectures?
                            
                                Why does this C code produce a double instead of a float?
                            
                                How do you deal with NUL?
                            
                                Are comments always processed before the preprocessor? [duplicate]
                            
                                Character pointers and integer pointers (++)
                            
                                C Void pointer question
                            
                                Cause for this difference..from C++ to C?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Why does padding have to be a power of two?

Tags:

c

struct

Jeyaram

People also ask

1 Answers

Eric Postpischil

Recent Activity

Donate For Us