
What C compilers have pointer subtraction underflows?

So, as I learned from Michael Burr's comments to this answer, the C standard doesn't define the result of subtracting an integer from a pointer when the result would point before the first element of an array (which I suppose includes any allocated memory).

From section 6.5.6 of the combined C99 + TC1 + TC2 (pdf):

If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.

I love pointer arithmetic, but this has never been something I've worried about before. I've always assumed that given:

 int a[1];
 int * b = a - 3;
 int * c = b + 3;

That c == a.

So while I believe I've done that sort of thing before, and not gotten bitten, it must have been due to the kindness of the various compilers I've worked with - that they've gone above and beyond what the standards require to make pointer arithmetic work the way I thought it did.

So my question is, how common is that? Are there commonly used compilers that don't do that kindness for me? Is proper pointer arithmetic beyond the bounds of an array a de facto standard?

rampion asked Jan 21 '26 13:01


2 Answers

MS-DOS FAR pointers had problems like this, which were usually covered over by "clever" use of the overlap between the segment register and the offset register in real mode. The effect there was that the 16-bit segment was shifted left 4 bits and added to the 16-bit offset, which gave a 20-bit physical address that could address 1MB, which was plenty because everyone knew that no one would ever need as much as 640KB of RAM. ;-)
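As a rough sketch of that address calculation (the function name `real_mode_linear` is made up for illustration, and the 20-bit mask assumes the address wraps at 1MB, as on an 8086):

```c
#include <stdint.h>

/* Real-mode address formation: (segment << 4) + offset.
   The mask keeps 20 bits, assuming wraparound at 1MB (8086 behavior,
   or a later CPU with the A20 line disabled). */
uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return (((uint32_t)segment << 4) + offset) & 0xFFFFFu;
}
```

Because the segment and offset overlap, many different segment:offset pairs name the same byte. For example, 0x1234:0x0010 and 0x1235:0x0000 both come out as 0x12350.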

In protected mode, the segment register was actually an index into a table of memory descriptors. A typical DOS extending runtime would usually arrange things so that many segments could be treated just like they would have been in real mode, which made porting code from real mode easy. But it had some defects. Primarily, the segment before an allocation was not part of the allocation, and so its descriptor might not even be valid.

On the 80286 in protected mode, just loading a segment register with a value that would cause an invalid descriptor to load would cause an exception, whether or not the descriptor was actually used to refer to memory.

A similar issue potentially occurs at one byte past the allocation. The last ++ on the pointer might have carried over to the segment register, causing it to load a new descriptor. In this case, it is reasonable to expect that the memory allocator could arrange for one safe descriptor past the end of the allocated range, but it would be unreasonable to expect it to arrange for any more than that.
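That single guaranteed position is what the usual "one past the end" loop idiom relies on. A minimal sketch (the name `sum_ints` is hypothetical): the `end` pointer may be computed and compared against, but never dereferenced.

```c
#include <assert.h>
#include <stddef.h>

/* Sum n ints by walking a pointer from a up to, but not including, end. */
long sum_ints(const int *a, size_t n)
{
    assert(a != NULL || n == 0);

    const int *end = a + n;   /* one past the last element: valid to form */
    long total = 0;
    for (const int *p = a; p != end; ++p)
        total += *p;          /* p is always inside the array here */
    return total;
}
```

Going the other way, a loop that counts down and stops when `p` drops below `a` would need the pointer one before the start, which the standard does not promise.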

RBerteig answered Jan 24 '26 02:01


This is not "implementation defined" by the Standard; it is "undefined" by the Standard. That means you can't count on a compiler supporting it, and you can't say, "well, this code is safe on compiler X". Once it invokes undefined behavior, your program's behavior is undefined.

The practical answer isn't "how (where, when, on what compiler) can I get away with this"; the practical answer is "don't do this".
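One way to get the same effect without leaving the array is to do the offset arithmetic on integers and only index once the result is back in range. This is just a sketch under that assumption; `get_biased` and its parameters are hypothetical names, not anything from the question.

```c
#include <assert.h>
#include <stddef.h>

/* Read logical element i of an n-element array whose logical indices
   start at `first` instead of 0. Only the integer index is shifted;
   the pointer `base` itself never moves outside the array. */
int get_biased(const int *base, size_t n, ptrdiff_t first, ptrdiff_t i)
{
    assert(i >= first && (size_t)(i - first) < n);
    return base[i - first];
}
```

With `first` set to 3, logical index 3 maps to `base[0]`, which is what `a - 3` followed by `+ 3` was trying to achieve, minus the undefined intermediate pointer.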

tpdi answered Jan 24 '26 01:01


