Why does the position of `[0]byte` in the struct matter?

A [0]byte in Go should not take up any memory, but these two structs have different sizes.

type bar2 struct {
    A int
    _ [0]byte
}

type bar3 struct {
    _ [0]byte
    A int
}

So why does the position of [0]byte matter here?

By the way, I use unsafe.Sizeof() to check the struct size. See the full example.

Rader asked Jul 12 '17 03:07



2 Answers

This is due to a subtle padding rule.

First please allow me to slightly rename the structs and fields so it'll be easier to talk about them:

type bar1 struct {
    A [0]byte
    I int
}

type bar2 struct {
    I int
    A [0]byte
}

This of course doesn't change the sizes and offsets, as can be verified on the Go Playground (the output below is from a platform where int is 4 bytes):

bar1 size:     4
bar1.A offset: 0
bar1.I offset: 0

bar2 size:     8
bar2.I offset: 0
bar2.A offset: 4
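
The numbers above can be reproduced with a small program. This is a minimal sketch, not the original Playground code; the helper layout and the constant intSize are names introduced here for illustration. The exact values depend on the platform (on a 64-bit build the int-sized quantities double) and assume the layout rules of the standard gc compiler.

```go
package main

import (
	"fmt"
	"unsafe"
)

// intSize is the size of an int on the current platform (hypothetical helper).
const intSize = unsafe.Sizeof(int(0))

// bar1 puts the zero-size field first.
type bar1 struct {
	A [0]byte
	I int
}

// bar2 puts the zero-size field last.
type bar2 struct {
	I int
	A [0]byte
}

// layout returns the sizes of bar1 and bar2 and the offset of bar2.A.
func layout() (size1, size2, offA2 uintptr) {
	var b1 bar1
	var b2 bar2
	return unsafe.Sizeof(b1), unsafe.Sizeof(b2), unsafe.Offsetof(b2.A)
}

func main() {
	var b1 bar1
	var b2 bar2
	fmt.Println("bar1 size:    ", unsafe.Sizeof(b1))
	fmt.Println("bar1.A offset:", unsafe.Offsetof(b1.A))
	fmt.Println("bar1.I offset:", unsafe.Offsetof(b1.I))
	fmt.Println()
	fmt.Println("bar2 size:    ", unsafe.Sizeof(b2))
	fmt.Println("bar2.I offset:", unsafe.Offsetof(b2.I))
	fmt.Println("bar2.A offset:", unsafe.Offsetof(b2.A))
}
```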

The size of a value of type [0]byte is zero, so it is perfectly valid in bar1 to not reserve any space for the first field (bar1.A), and lay out the bar1.I field with 0 offset.

The question is: why can't the compiler do the same in the 2nd case (with bar2)?

A field's address must lie at or after the end of the memory area reserved for the previous field. In the first case the first field bar1.A has 0 size, so the 2nd field may have 0 offset: it will not "overlap" with the first field.

In case of bar2, the second field cannot have an address (and therefore an offset) that overlaps with the first field, so its offset cannot be less than the size of int, which is 4 bytes on 32-bit architectures (and 8 bytes on 64-bit architectures).

This still seems ok. But since bar2.A has zero size, why can't the size of the struct bar2 be just that: 4 bytes (or 8 in 64-bit arch)?

This is because it is perfectly valid to take the address of fields (and variables) that have 0 size. Ok, so what?

In case of bar2, the compiler has to insert 4 (or 8) bytes of padding; otherwise the address of the bar2.A field would point outside of the memory area reserved for a value of type bar2.

As an example, without padding a value of bar2 may have an address of 0x100, size 4, so memory reserved for the struct value has address range 0x100 .. 0x103. Address of bar2.A would be 0x104, that is outside of the struct's memory. In case of an array of this struct (e.g. x [5]bar2), if the array starts at 0x100, address of x[0] would be 0x100, address of x[0].A would be 0x104, and address of the subsequent element x[1] would also be 0x104 but that's the address of another struct value! Not cool.
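
The array scenario can be checked directly. The following is a small sketch (the helper addrs is a name made up for this illustration) that compares the address of x[0].A with the address of x[1]; with the padding in place the two should differ. It assumes the standard gc compiler.

```go
package main

import (
	"fmt"
	"unsafe"
)

type bar2 struct {
	I int
	A [0]byte
}

// addrs returns the addresses of x[0].A and x[1] as integers,
// so that they can be compared and printed.
func addrs(x *[5]bar2) (fieldA, next uintptr) {
	return uintptr(unsafe.Pointer(&x[0].A)), uintptr(unsafe.Pointer(&x[1]))
}

func main() {
	var x [5]bar2
	a, n := addrs(&x)
	fmt.Printf("&x[0].A = %#x\n", a)
	fmt.Printf("&x[1]   = %#x\n", n)
	fmt.Println("distinct addresses:", a != n)
}
```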

To avoid this, the compiler inserts padding (4 or 8 bytes depending on the arch), so that the address of bar2.A never falls outside of the struct's memory. Otherwise this could raise questions and cause problems regarding garbage collection: if only the address of bar2.A is kept, but not the struct or another pointer to it or its other fields, the whole struct should not be garbage collected, yet since no pointer points into its memory area, collecting it would seem to be valid. The inserted padding is 4 (or 8) bytes because the struct's size is rounded up to a multiple of its alignment, and according to Spec: Size and alignment guarantees:

For a variable x of struct type: unsafe.Alignof(x) is the largest of all the values unsafe.Alignof(x.f) for each field f of x, but at least 1.

If this is so, adding an additional int field would make the size of both structs equal:

type bar1 struct {
    I int
    A [0]byte
    X int
}

type bar2 struct {
    A [0]byte
    I int
    X int
}

And indeed, both are 8 bytes on 32-bit arch (and 16 bytes on 64-bit arch); try it on the Go Playground:

bar1 size:     8
bar1.I offset: 0
bar1.A offset: 4
bar1.X offset: 4

bar2 size:     8
bar2.A offset: 0
bar2.I offset: 0
bar2.X offset: 4
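
A quick way to confirm this on your own machine is sketched below. The helper sizes and the constant intSize are illustrative names, not part of the original example, and the check assumes the standard gc compiler.

```go
package main

import (
	"fmt"
	"unsafe"
)

// intSize is the size of an int on the current platform (hypothetical helper).
const intSize = unsafe.Sizeof(int(0))

// bar1 has the zero-size field in the middle.
type bar1 struct {
	I int
	A [0]byte
	X int
}

// bar2 has the zero-size field first.
type bar2 struct {
	A [0]byte
	I int
	X int
}

// sizes returns the sizes of both three-field structs.
func sizes() (uintptr, uintptr) {
	return unsafe.Sizeof(bar1{}), unsafe.Sizeof(bar2{})
}

func main() {
	s1, s2 := sizes()
	fmt.Println("bar1 size:", s1)
	fmt.Println("bar2 size:", s2)
	fmt.Println("equal:", s1 == s2)
}
```

Because the zero-size field is no longer the last one in either struct, its address already falls inside the struct and no extra padding is needed.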

See related question: Struct has different size if the field order is different

icza answered Sep 18 '22 22:09


The reason is "holes":

Holes are the unused spaces added by the compiler to ensure that the following field or element is properly aligned relative to the start of the struct or array [1]

For example (numbers based on whatever hardware the Go Playground is using):

struct {bool; float64; int16} // 24 bytes
struct {float64; bool; int16} // 16 bytes

You can verify the layout of a struct using:

  • unsafe.Alignof returns the required alignment
  • unsafe.Offsetof computes the offset of a field relative to the start of its enclosing struct, including holes
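
As a rough sketch of how these functions expose the holes, the program below prints the layout of the two example structs. The type names loose and tight, the field names, and the helper structSizes are invented for this illustration; the values printed depend on the platform, since float64 alignment differs between some 32-bit and 64-bit targets.

```go
package main

import (
	"fmt"
	"unsafe"
)

// loose corresponds to struct {bool; float64; int16}.
type loose struct {
	a bool
	b float64
	c int16
}

// tight corresponds to struct {float64; bool; int16}.
type tight struct {
	b float64
	a bool
	c int16
}

// structSizes returns the sizes of loose and tight.
func structSizes() (uintptr, uintptr) {
	return unsafe.Sizeof(loose{}), unsafe.Sizeof(tight{})
}

func main() {
	var l loose
	var t tight

	fmt.Println("loose: size", unsafe.Sizeof(l), "align", unsafe.Alignof(l))
	fmt.Println("  a offset", unsafe.Offsetof(l.a))
	fmt.Println("  b offset", unsafe.Offsetof(l.b))
	fmt.Println("  c offset", unsafe.Offsetof(l.c))

	fmt.Println("tight: size", unsafe.Sizeof(t), "align", unsafe.Alignof(t))
	fmt.Println("  b offset", unsafe.Offsetof(t.b))
	fmt.Println("  a offset", unsafe.Offsetof(t.a))
	fmt.Println("  c offset", unsafe.Offsetof(t.c))
}
```

A gap between the end of one field and the offset of the next is a hole; any space between the end of the last field and the reported size is trailing padding.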

[1] Donovan, A. and Kernighan, B., 2016. The Go Programming Language. 1st ed. New York: Addison-Wesley, p. 354.

shusson answered Sep 17 '22 22:09