
Pointer implementation details in C

I would like to know which architectures violate the assumptions I've listed below. I would also like to know whether any of the assumptions are false for all architectures (that is, if any of them are just completely wrong).

  1. sizeof(int *) == sizeof(char *) == sizeof(void *) == sizeof(func_ptr *)

  2. The in-memory representation of all pointers for a given architecture is the same regardless of the data type pointed to.

  3. The in-memory representation of a pointer is the same as an integer of the same bit length as the architecture.

  4. Multiplication and division of pointer data types are only forbidden by the compiler. NOTE: Yes, I know this is nonsensical. What I mean is - is there hardware support to forbid this incorrect usage?

  5. All pointer values can be cast to a single integer. In other words, what architectures still make use of segments and offsets?

  6. Incrementing a pointer is equivalent to adding sizeof(the pointed data type) to the memory address stored by the pointer. If p is an int32* then p+1 is equal to the memory address 4 bytes after p.

I'm most used to pointers being used in a contiguous, virtual memory space. For that usage, I can generally get by thinking of them as addresses on a number line. See Stack Overflow question Pointer comparison.

Will Bickford asked Aug 29 '09 21:08





2 Answers

I can't give you concrete examples of all of these, but I'll do my best.

sizeof(int *) == sizeof(char *) == sizeof(void *) == sizeof(func_ptr *)

I don't know of any systems where this is false, but consider:

Mobile devices often have some amount of read-only memory in which program code and such is stored. Read-only values (const variables) may conceivably be stored in read-only memory. And since the ROM address space may be smaller than the normal RAM address space, the pointer size may be different as well. Likewise, pointers to functions may have a different size, as they may point to this read-only memory into which the program is loaded, and which can otherwise not be modified (so your data can't be stored in it).

So I don't know of any platforms on which I've observed that the above doesn't hold, but I can imagine systems where it might be the case.
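A quick way to check assumption 1 on a given compiler is to print the sizes and compare them. This is only a sketch: func_ptr here is a hypothetical typedef for a function pointer (the question's sizeof(func_ptr *) would measure a pointer to a function pointer, which is an ordinary object pointer), and pointer_sizes_match is a name made up for illustration. The result says something about one implementation, not about what the standard requires.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical function pointer type, standing in for the question's func_ptr. */
typedef void (*func_ptr)(void);

/* Prints the size of several pointer types and returns 1 if they are
   all equal on this implementation, 0 otherwise. */
int pointer_sizes_match(void)
{
    /* The one relation the C standard does promise here: void * and
       char * share the same representation, hence the same size. */
    assert(sizeof(void *) == sizeof(char *));

    printf("int *    : %zu\n", sizeof(int *));
    printf("char *   : %zu\n", sizeof(char *));
    printf("void *   : %zu\n", sizeof(void *));
    printf("func_ptr : %zu\n", sizeof(func_ptr));

    return sizeof(int *) == sizeof(char *)
        && sizeof(char *) == sizeof(func_ptr);
}
```

Calling pointer_sizes_match() from main will most likely return 1 on a typical desktop target, but a return value of 0 would still be conforming.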

The in-memory representation of all pointers for a given architecture is the same regardless of the data type pointed to.

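Think of C++ member pointers vs regular pointers. They don't have the same representation (or size). A pointer to a data member is typically stored as an offset into the object, and a pointer to a member function commonly carries a function address or vtable index plus a this adjustment.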

And as above, it is conceivable that some CPUs would load constant data into a separate area of memory, which used a separate pointer format.

The in-memory representation of a pointer is the same as an integer of the same bit length as the architecture.

Depends on how that bit length is defined. :) An int on many 64-bit platforms is still 32 bits, but a pointer is 64 bits. As already said, CPUs with a segmented memory model will have pointers consisting of a pair of numbers. Likewise, member pointers may consist of a pair of numbers.
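If you want to see whether a pointer and an integer of the same width share a representation on your platform, you can compare their raw bytes. A minimal sketch, assuming the optional type uintptr_t is available; repr_matches_uintptr is a hypothetical helper name, and the answer it gives is specific to the implementation you run it on.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if p and (uintptr_t)p have identical bytes in memory,
   0 if the bytes differ, and -1 if the two types differ in size. */
int repr_matches_uintptr(void *p)
{
    uintptr_t n = (uintptr_t)p;   /* uintptr_t is an optional type */
    void *copy;

    /* Copying the raw bytes of a pointer reproduces the pointer value. */
    memcpy(&copy, &p, sizeof p);
    assert(copy == p);

    if (sizeof n != sizeof p)
        return -1;
    return memcmp(&n, &p, sizeof p) == 0;
}
```

On a flat-address machine this will probably return 1, but nothing in the language requires it.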

Multiplication and division of pointer data types are only forbidden by the compiler.

Ultimately, pointer data types only exist in the compiler. What the CPU works with is not pointers, but integers and memory addresses. So there is nowhere else where these operations on pointer types could be forbidden. You might as well ask for the CPU to forbid concatenation of C++ string objects. It can't do that because the C++ string type only exists in the C++ language, not in the generated machine code.

However, to answer what you mean, look up the Motorola 68000 CPUs. I believe they have separate registers for integers and memory addresses, which means that they can easily forbid such nonsensical operations.
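To make the "only exists in the compiler" point concrete: the compiler rejects p / 2 when p is a pointer, but once the value has been converted to an integer the division is just ordinary arithmetic. A small sketch, assuming the optional type uintptr_t exists; halve_address and demo_halve_address are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Divides the integer image of an address by two. Writing p / 2
   directly on the pointer is rejected at compile time. */
uintptr_t halve_address(const void *p)
{
    uintptr_t n = (uintptr_t)p;   /* implementation-defined mapping */
    return n / 2;                 /* plain unsigned integer division */
}

void demo_halve_address(void)
{
    int x = 0;
    /* The result is only a number. Converting it back to a pointer
       and dereferencing it would not be meaningful. */
    assert(halve_address(&x) == (uintptr_t)(const void *)&x / 2);
}
```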

All pointer values can be cast to a single integer.

You're mostly safe there. The C and C++ standards allow a pointer to be converted to an integer type, no matter the memory space layout, CPU architecture and anything else, and they specify that the mapping is implementation-defined. If the implementation provides intptr_t or uintptr_t (both are optional), you can convert a void pointer to that type and convert the integer back to get a pointer that compares equal to the original. If the integer type is too small to hold the result, the behavior is undefined. The C/C++ languages say nothing about what the intermediate integer value should be. That is up to the individual compiler, and the hardware it targets.
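The round trip looks like this in practice. This is a sketch that assumes the implementation provides uintptr_t from <stdint.h>, which is an optional type; survives_round_trip and demo_round_trip are names invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Converts a pointer to an integer and back, and reports whether the
   result compares equal to the original pointer. */
int survives_round_trip(void *p)
{
    uintptr_t n = (uintptr_t)p;   /* pointer -> integer, value is implementation-defined */
    void *q = (void *)n;          /* integer -> pointer */
    return q == p;
}

void demo_round_trip(void)
{
    int x = 42;
    assert(survives_round_trip(&x));
}
```

Note that the code never inspects n itself. Only the pointer that comes back is meaningful in portable code.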

Incrementing a pointer is equivalent to adding sizeof(the pointed data type) to the memory address stored by the pointer.

Again, this is guaranteed. If you consider that conceptually, a pointer does not point to an address, it points to an object, then this makes perfect sense. Adding one to the pointer will then obviously make it point to the next object. If an object is 20 bytes long, then incrementing the pointer will move it 20 bytes, so that it moves to the next object.

If a pointer were merely a memory address in a linear address space, if it were basically an integer, then incrementing it would add 1 to the address -- that is, it would move to the next byte.
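You can observe the step size by viewing both pointers as char * and subtracting them. A minimal sketch: struct record is a hypothetical 20-byte object, the function names are made up, and int32_t is an optional type that is 4 bytes wide on platforms with 8-bit bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct record { char bytes[20]; };   /* hypothetical 20-byte object */

/* Distance in bytes between p and p + 1 for an int32_t pointer. */
ptrdiff_t step_int32(void)
{
    int32_t a[2] = {0, 0};
    int32_t *p = a;
    return (char *)(p + 1) - (char *)p;
}

/* Same measurement for a pointer to struct record. */
ptrdiff_t step_record(void)
{
    struct record r[2] = {{{0}}};
    struct record *p = r;
    return (char *)(p + 1) - (char *)p;
}

void demo_steps(void)
{
    /* The step is always the size of the pointed-to object. */
    assert(step_int32() == (ptrdiff_t)sizeof(int32_t));
    assert(step_record() == (ptrdiff_t)sizeof(struct record));
}
```

The subtraction is done on char * because char is the unit sizeof counts in, so the difference is a byte count rather than an element count.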

Finally, as I mentioned in a comment to your question, keep in mind that C++ is just a language. It doesn't care which architecture it is compiled to. Many of these limitations may seem obscure on modern CPUs. But what if you're targeting yesteryear's CPUs? What if you're targeting the next decade's CPUs? You don't even know how they'll work, so you can't assume much about them. What if you're targeting a virtual machine? Compilers already exist which generate bytecode for Flash, ready to run from a website. What if you want to compile your C++ to Python source code?

Staying within the rules specified in the standard guarantees that your code will work in all these cases.

jalf answered Sep 28 '22 09:09



I don't have specific real-world examples in mind, but the "authority" is the C standard. If something is not required by the standard, you can build a conforming implementation that intentionally fails to comply with any other assumptions. Some of these assumptions are true most of the time just because it's convenient to implement a pointer as an integer representing a memory address that can be directly fetched by the processor, but this is just a consequence of "convenience" and can't be held as a universal truth.

  1. Not required by the standard (see this question). For instance, sizeof(int*) can be unequal to sizeof(double*). void* is guaranteed to be able to store any object pointer value (function pointers are not covered by that guarantee).
  2. Not required by the standard. By definition, size is a part of representation. If the size can be different, the representation can be different too.
  3. Not necessarily. In fact, "the bit length of an architecture" is a vague statement. What is a 64-bit processor, really? Is it the address bus? Size of registers? Data bus? What?
  4. It doesn't make sense to "multiply" or "divide" a pointer. It's forbidden by the compiler but you can of course multiply or divide the underlying representation (which doesn't really make sense to me) and that results in undefined behavior.
  5. Maybe I don't understand your point but everything in a digital computer is just some kind of binary number.
  6. Yes; kind of. It's guaranteed to point to a location that's sizeof(pointed_type) bytes farther. It's not necessarily equivalent to arithmetic addition of a number (i.e. farther is a logical concept here; the actual representation is architecture specific).
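The void* guarantee from point 1 can be sketched in a few lines. The helper names through_void and demo_through_void are hypothetical; the behavior shown, an object pointer surviving a trip through void *, is what the standard promises.

```c
#include <assert.h>

/* Passes an int pointer through void * and hands it back. */
int *through_void(int *p)
{
    void *v = p;   /* implicit conversion, no cast needed in C */
    return v;      /* implicitly converted back to int * */
}

void demo_through_void(void)
{
    int x = 7;
    int *q = through_void(&x);
    assert(q == &x);   /* the pointer compares equal to the original */
    assert(*q == 7);   /* and still refers to the same object */
}
```

This holds even on an implementation where int * and void * differ in size or representation, because the conversion itself is allowed to change the bits.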
mmx answered Sep 28 '22 09:09
