
Why can't we compare pointers that don't point to elements within the same array?

I have been learning C language and following "Let Us C" by Yashavant P. Kanetkar.

There is a line in the pointers chapter that says we can only use the relational operators, i.e. less than (<) and greater than (>), on pointers that point to elements within the same array.

Why is comparing arbitrary pointers not valid?

asked Aug 10 '15 by ashwani
1 Answer

Because C makes no assumptions about the host machine, and nothing stops that machine from allocating two arrays in two completely separate address spaces.

It's not just about theoretical exotic architectures either. 16-bit compilers for x86 machines provided two kinds of pointers. Near pointers were 16 bits wide and behaved as you'd expect; however, they only let you access 64K of RAM. If you wanted to access more than 64K of RAM (not 64K for each block: 64K for the whole program!) you had to use far pointers.

Far pointers were 32 bits wide, and made of two 16-bit halves, the segment and the offset; for example 1234:0000 is a pointer that has segment 0x1234 and offset 0. The actual memory address was segment * 16 + offset. Typically, farmalloc returned a pointer with zero offset, and pointer arithmetic only modified the offset. So you could have

 char *x = farmalloc(64);     // returns 1234:0000 for address 0x12340
 char *y = farmalloc(64);     // returns 1238:0000 for address 0x12380

Now if you compute x + 128, the result is 1234:0080, for address 0x123C0. It compares less than 1238:0000 (because 0x1234 < 0x1238) but it points to a higher address (because 0x123C0 > 0x12380).

Why was that acceptable? Because adding 128 to x, which pointed to a 64-byte object, was undefined behavior in the first place.

The memory model compiler settings defined whether the default size of pointers was near or far. For example, the "small" memory model had 64K for code and 64K for all of global variables, auto variables (stack) and the malloc heap. Note that the code was in a separate segment, so you couldn't just take a 16-bit ("near") function pointer and dereference it to read machine language! If you had to do that, you had to ask the compiler to put the code in the same segment as the rest (the "tiny" memory model).

Some memory models had the compiler always use far pointers, which was slower but necessary if data+stack+heap exceeded 64K ("compact" or "large" memory models).

The size of code and data was also different, so you could have a memory model where function pointers were near but data pointers were far, or vice versa. This is the case with the aforementioned "compact" model (64K code limit but far pointers for data) and the dual "medium" model (far pointers for code, 64K data limit).

There was also a way for compilers to normalize pointers after every arithmetic operation so that comparisons behaved consistently (the so-called "huge" pointers and memory model), but it was slow and hardly anybody used it.

answered Sep 20 '22 by Quentin