
Why must all pointers to structs be of the same size?

The C standard specifies:

A pointer to void shall have the same representation and alignment requirements as a pointer to a character type. Similarly, pointers to qualified or unqualified versions of compatible types shall have the same representation and alignment requirements. All pointers to structure types shall have the same representation and alignment requirements as each other. All pointers to union types shall have the same representation and alignment requirements as each other. Pointers to other types need not have the same representation or alignment requirements.

i.e. sizeof(int*) is not necessarily equal to sizeof(char*) - but sizeof(struct A*) is necessarily equal to sizeof(struct B*).
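A minimal sketch of what the quoted paragraph does and does not promise, assuming a C11 (or later) compiler. The names `struct A`, `struct B`, `struct Incomplete` and the three helper functions are made up for illustration. The static assertions must hold on every conforming implementation, whereas the helpers only report what a particular platform happens to use for `int *` and `char *`, which may or may not match.

```c
#include <assert.h>
#include <stddef.h>

struct A { char c; };          /* tiny structure */
struct B { double d[16]; };    /* large, strictly aligned structure */
struct Incomplete;             /* declared but never defined */

/* Same representation and alignment implies the same size. */
static_assert(sizeof(struct A *) == sizeof(struct B *),
              "all struct pointers share one size");
static_assert(sizeof(struct A *) == sizeof(struct Incomplete *),
              "this also holds for pointers to incomplete struct types");
static_assert(_Alignof(struct A *) == _Alignof(struct B *),
              "all struct pointers share one alignment");

/* Platform-dependent values: the standard does not require these to agree. */
size_t struct_ptr_size(void) { return sizeof(struct A *); }
size_t int_ptr_size(void)    { return sizeof(int *); }
size_t char_ptr_size(void)   { return sizeof(char *); }
```

On mainstream desktop and server platforms all three helpers return the same value; the point is only that the language guarantees it for the first one.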

What is the rationale behind this requirement? As I understand it the rationale behind differing sizes for basic types is to support use cases like near/far/huge pointers (edit: as was pointed out in comments and in the accepted answer, this is not the rationale) - but doesn't this same rationale apply to structs in different locations in memory?

Daniel Kleinstein asked Aug 09 '21 20:08



2 Answers

The answer is very simple: struct and union types can be declared as opaque types, i.e. without an actual definition of the struct or union details. If the representation of pointers differed depending on the structures' details, how would the compiler determine what representation to use for opaque pointers appearing as arguments or return values, or even just being read from or stored to memory?

The natural consequence of the ability to manipulate opaque pointer types is that all such pointers must have the same representation. Note, however, that pointers to struct and pointers to union may have different representations from each other, as may pointers to basic types such as char, int, double...
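To make the opaque-type argument concrete, here is a small sketch in a single file. The names `struct widget`, `struct widget_slot`, `slot_swap` and the `widget_*` functions are hypothetical. The point is that `slot_swap` is compiled while `struct widget` is still incomplete, so the compiler has to choose a pointer representation without knowing anything about the structure's members.

```c
#include <assert.h>
#include <stdlib.h>

/* "Header" part: struct widget is declared but not defined here. */
struct widget;
struct widget *widget_create(int value);
int widget_value(const struct widget *w);
void widget_destroy(struct widget *w);

/* Client code: stores, passes and returns pointers to the incomplete type. */
struct widget_slot {
    struct widget *held;
};

struct widget *slot_swap(struct widget_slot *slot, struct widget *incoming)
{
    struct widget *previous = slot->held;  /* read a pointer from memory */
    slot->held = incoming;                 /* store a pointer to memory */
    return previous;                       /* return a pointer by value */
}

/* "Implementation" part: only here does the definition become visible. */
struct widget {
    int value;
};

struct widget *widget_create(int value)
{
    struct widget *w = malloc(sizeof *w);
    if (w != NULL)
        w->value = value;
    return w;
}

int widget_value(const struct widget *w)
{
    assert(w != NULL);
    return w->value;
}

void widget_destroy(struct widget *w)
{
    free(w);
}
```

In a real project the first part would typically live in a header, with the definition hidden in one source file; the pointer representation has to be the same in both places.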

Another distinction regarding pointer representation is between pointers to data and pointers to functions, which may have different sizes. Such a difference is more common in current architectures, albeit still rare outside operating system and device driver space. Using 64 bits for function pointers seems a waste, as 4GB should be amply sufficient for code space, but modern architectures take advantage of the extra space to store pointer signatures that harden code against malicious attacks. Another use is to take advantage of pointer bits that the hardware ignores or that typical address spaces leave unused (e.g. current x86_64 systems commonly use only the low 48 bits of a virtual address, although by default the upper bits must be restored to canonical form before the pointer is dereferenced) to store type information or to use NaN values unmodified as pointers.
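As a rough illustration of storing type information in spare pointer bits, the sketch below keeps a 16-bit tag in the top bits of a 64-bit integer and strips it again before the pointer is used. It relies on several assumptions that ISO C does not guarantee: `uintptr_t` exists and is 64 bits wide, pointer-to-integer conversion round-trips in the obvious way, and the addresses involved fit in the low 48 bits. The function names and the constants `TAG_SHIFT` and `ADDR_MASK` are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

#define TAG_SHIFT 48
#define ADDR_MASK ((UINT64_C(1) << TAG_SHIFT) - 1)

/* Assumption: 64-bit pointers; compilation fails elsewhere. */
static_assert(sizeof(uintptr_t) == 8, "sketch assumes 64-bit pointers");

/* Pack a 16-bit tag into the otherwise unused top bits. */
uint64_t tag_pointer(void *p, uint16_t tag)
{
    uint64_t bits = (uint64_t)(uintptr_t)p;
    assert((bits & ~ADDR_MASK) == 0);  /* assumption: address fits in 48 bits */
    return bits | ((uint64_t)tag << TAG_SHIFT);
}

/* Recover the tag. */
uint16_t pointer_tag(uint64_t tagged)
{
    return (uint16_t)(tagged >> TAG_SHIFT);
}

/* Strip the tag so the result can be dereferenced safely. */
void *untag_pointer(uint64_t tagged)
{
    return (void *)(uintptr_t)(tagged & ADDR_MASK);
}
```

The tagged value is kept in an integer rather than a pointer object, and the tag is always masked off in software, so the sketch does not depend on the hardware ignoring any bits.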

Furthermore, the near/far/huge pointer attributes from legacy 16-bit code were not correctly addressed by this remark in the C Standard, as all pointers could be near, far or huge. Yet the distinction between code pointers and data pointers in mixed-model code was covered by it and still seems current on some OSes.

Finally, POSIX mandates that all pointers have the same size and representation, so mixed-model code should quickly become a historical curiosity.

It is arguable that architectures where the representation is different for different data types are vanishingly rare nowadays and that it may be high time to clean up the standard and remove this option. The main objection is support for architectures where the addressable units are large words and 8-bit bytes are addressed using extra information, making char * and void * larger than regular pointers. Yet such architectures make pointer arithmetic very cumbersome and are quite rare too (I personally have never seen one).
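The word-addressed case can be simulated in portable C to show why a byte pointer needs extra information there. This is only a model, not code for any real machine: the 4-byte word size, the `memory` array, the `word_ptr` and `byte_ptr` types and the helper functions are all assumptions made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BYTES_PER_WORD 4
#define MEMORY_WORDS   16

/* Simulated memory in which only whole 32-bit words are addressable. */
static uint32_t memory[MEMORY_WORDS];

/* A "regular" pointer only needs a word index. */
typedef struct { size_t word; } word_ptr;

/* A byte pointer needs the word index plus a byte position inside the word. */
typedef struct { size_t word; unsigned byte; } byte_ptr;

/* Pointer arithmetic must carry from the byte position into the word index. */
byte_ptr byte_ptr_add(byte_ptr p, size_t n)
{
    size_t total = p.byte + n;
    p.word += total / BYTES_PER_WORD;
    p.byte = (unsigned)(total % BYTES_PER_WORD);
    return p;
}

/* Loading a byte means loading the word, then shifting and masking. */
uint8_t byte_load(byte_ptr p)
{
    assert(p.word < MEMORY_WORDS);
    return (uint8_t)(memory[p.word] >> (8 * p.byte));
}

/* Storing a byte is a read-modify-write of the containing word. */
void byte_store(byte_ptr p, uint8_t v)
{
    unsigned shift = 8 * p.byte;
    assert(p.word < MEMORY_WORDS);
    memory[p.word] = (memory[p.word] & ~(UINT32_C(0xFF) << shift))
                   | ((uint32_t)v << shift);
}
```

In this model `byte_ptr` is necessarily larger than `word_ptr`, which mirrors the situation where char * and void * are wider than other data pointers.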

chqrlie answered Sep 25 '22 22:09


In the C language invented by Dennis Ritchie, when a C compiler encountered a definition for struct foo *p; it would have no need to care about whether or how the structure was defined unless or until a program used pointer arithmetic or the -> operator. Otherwise, it could simply record that p was a pointer to a structure with tag foo without having to know or care about if, where, or how such a structure might be defined. The Standard adds an odd little wrinkle which sometimes makes structure pointers with matching tags incompatible, but the issue remains that a compiler must be able to process a declaration of a pointer-to-structure type, as well as basic assignments between such pointers, in cases where it might not know the contents of a structure.

Note that on platforms where pointers to objects with arbitrary alignment may be larger than pointers to objects that are known to have int alignment, a compiler might sensibly specify that all structures have int alignment even if they only contain character members. Further, compilers for such platforms might decide to process pointers to unions in such a way as to allow a pointer to any object--even a character--to be converted into a pointer to any union containing such an object, and used to access that object within the union. This may require that pointers to union objects be the size of a byte pointer, rather than a smaller int pointer.

Note that in pre-standard compilers, if two structures contained matching members, a function that accepted a void* and converted it into one structure type would have been expected to be usable to operate on both types interchangeably. Unfortunately, the Standard allows compilers to assume that code will never do such a thing, and provides no means for programmers to indicate when two structures should be usable interchangeably.

supercat answered Sep 21 '22 22:09