Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does padding have to be a power of two?

Tags:

c

struct

I am doing some sample programs to explore C and would like to know why structure padding can be done in power of two only.

#include <stdio.h>

#pragma pack(push, 3)

union aaaa
{

   struct bbb
   {
      int a;
      double b;
      char c;
   }xx;

   float f;
};

#pragma pack(pop)

int main()
{

printf("\n Size: %d", sizeof(union aaaa));

return 0;
}

While compiling

warning: alignment must be a small power of two, not 3 [-Wpragmas]
warning: #pragma pack (pop) encountered without matching #pragma pack (push) [-Wpragmas]

It seems #pragma has no effect. Output is 24 only. i.e 4 byte aligned.

like image 661
Jeyaram Avatar asked Jul 26 '12 10:07

Jeyaram


People also ask

What is meant by padding and why padding is required?

Bits or characters that fill up unused portions of a data structure, such as a field, packet or frame. Typically, padding is done at the end of the structure to fill it up with data, with the padding usually consisting of 1 bits, blank characters or null characters.

Why do we need padding in C?

The answer to that lies in how a CPU accesses memory. Typically a CPU has alignment constraints, e.g. a CPU will access one word at a time, or a CPU will require data to be 16byte aligned, etc. So to make sure that data is aligned according to the constraints of the CPU, padding is required.

How many padding bytes are necessary?

The total number of padding bytes is at least one, and is the number that is required in order to bring the data length up to a multiple of the cipher algorithm block size.

What is the purpose of data structure padding?

Structure padding is a concept in C that adds the one or more empty bytes between the memory addresses to align the data in memory.


1 Answers

The short answer is that basic objects in processors have sizes that are small powers of two (e.g., 1, 2, 4, 8, and 16 bytes) and memory is organized in groups whose size is a small power of two (e.g., 8 bytes), so structures must be aligned to work well with those sizes.

The long answer is that the reasons for this are grounded in physics and elementary mathematics. Computers naturally work with bits, with values 0 and 1. This is because it is easy to design physical things that switch between two values: High voltage and low voltage, presence of a charge or absence of a charge, et cetera. Distinguishing between three values is harder, because you have to be more sensitive about transitions between values. So, as computer technology has developed over the decades, we have used bits (binary digits) instead of alternatives like trinary digits.

To make larger numbers, we combine multiple bits. So two bits can, combined, have four values. Three bits can have eight values, and so on. In older computers, sometimes bits were grouped six or ten at a time. However, eight became common, and is essentially standard now. Using eight bits for a byte does not have as strong a physical reason as some of the other groupings I describe, but it is the way of the world.

Another feature of computers is memory. Once we have these bytes, we want to store a lot of them, in a device that is easily accessible to the processor, so we can get lots of bytes into and out of the processor quickly. When we have a lot of bytes, we need a way for the processor to tell memory which bytes the processor wants to read or write. So the processor needs a way to address the bytes.

The processor uses bits for values, so it is going to use bits for address values. So the memory will be built to accept bits to indicate which bytes to supply to the processor when the processor reads or which bytes to store when the processor writes. What does the memory device do with those bits? An easy thing is to use one bit to control one switch of the pathways to memory. The memory will be made out of many small parts that store bytes.

Consider a thing in the memory device that can store a byte, and consider two of those things next to each other, say A and B. We can use a switch to select whether we want the A byte to be active or the B byte to be active. Now consider four of those things, say A, B, C, and D. We can use one switch to select whether to use the A-B group or to use the C-D group. Then another switch selects A or B (if using the A-B group) or C or D (if using the C-D) group.

This process continues: Each bit in a memory address selects a group of storage units to use. 1 bit selects between 2 storage units, 2 select between 4, 3 select between 8, 4 select between 16, and so on. 8 bits select between 256 storage units, 24 bits select between 16,777,216 storage units, and 32 bits select between 4,294,967,296 storage units.

There is one more complication. Moving individual bytes between the processor and memory is slow. Instead, modern computers organize memory into larger pieces, such as eight bytes. You can only move eight bytes at a time between memory and the processor. When the processor requests that memory supply some data, the processor only sends the high bits of the address—the low three bits select individual bytes within eight bytes, and they are not sent to memory.

This is faster because the processor gets eight bytes in the time it would otherwise take to have the memory do all its switching to supply one byte, and it is cheaper because you do not need the huge number of extra switches it would take to distinguish individual bytes in memory.

However, now it means the processor has no way of getting an individual byte from memory. When you execute an instruction that accesses an individual byte, the processor must read eight bytes from memory and then shift those bytes around inside the processor to get the one byte you want. Similarly, to get two or four bytes, the processor reads eight bytes and extracts just the bytes you want.

To simplify this process, processor designers specify that data should be aligned in certain ways. Typically, they require two-byte data (like 16-bit integers) to be aligned to multiples of two bytes, four-byte data (like 32-bit integers and 32-bit floating-point values) to be aligned to multiples of four bytes, and eight-byte data to be aligned to multiples of eight bytes.

This required alignment has two effects. First, because four-byte data can only start at two places in an eight-byte chunk read from memory (the beginning or the middle), the processor designers only need to put in wires to extract the four bytes from two places. They do not need to add all the extra wires to extract four bytes from any of the eight individual bytes that could be starting places if any alignment were allowed. (Some processors will completely prohibit loading unaligned data, and some processors will allow it but use slow methods to extract it that use fewer wires but use an iterative algorithm to shift the data over several processor cycles, so unaligned loads are slow.)

The second effect is that, because four-byte data can only start at two places in an eight-byte chunk, it also ends inside that chunk. Consider what would happen if you tried to load four bytes of data that started at the sixth byte of an eight-byte chunk. The first two bytes are in the chunk, but the next two bytes are in the next chunk in memory. The processor would have to read two chunks from memory, take different bytes from each of them, and put those bytes together. That is much slower than just reading one chunk.

So, memory is organized by powers of two because that is a natural result of bits, and processors require alignment because that makes memory access more efficient. The alignment is naturally powers of two, and that is why your structure sizes work better when they are multiples of the powers of two that are used for alignment.

like image 182
Eric Postpischil Avatar answered Sep 28 '22 23:09

Eric Postpischil