Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is a pointer with the right address and type still always a valid pointer since C++17?

(In reference to this question and answer.)

Before the C++17 standard, the following sentence was included in [basic.compound]/3:

If an object of type T is located at an address A, a pointer of type cv T* whose value is the address A is said to point to that object, regardless of how the value was obtained.

But since C++17, this sentence has been removed.

For example I believe that this sentence made this example code defined, and that since C++17 this is undefined behavior:

 alignas(int) unsigned char buffer[2*sizeof(int)];  auto p1=new(buffer) int{};  auto p2=new(p1+1) int{};  *(p1+1)=10; 

Before C++17, p1+1 holds the address to *p2 and has the right type, so *(p1+1) is a pointer to *p2. In C++17 p1+1 is a pointer past-the-end, so it is not a pointer to object and I believe it is not dereferencable.

Is this interpretation of this modification of the standard right or are there other rules that compensate the deletion of the cited sentence?

like image 726
Oliv Avatar asked Jan 02 '18 14:01

Oliv


People also ask

What is valid pointer in C?

The Pointer in C, is a variable that stores address of another variable. A pointer can also be used to refer to another pointer function. A pointer can be incremented/decremented, i.e., to point to the next/ previous memory location. The purpose of pointer is to save memory space and achieve faster execution time.

How to declare pointers in C?

Declaration of C Pointer variable General syntax of pointer declaration is, datatype *pointer_name; Data type of a pointer must be same as the data type of the variable to which the pointer variable is pointing. void type pointer works with all data types, but is not often used.

Can pointer refer to any type?

A pointer type variable holds the address of a data object or a function. A pointer can refer to an object of any one data type; it cannot refer to a bit field or a reference. Some common uses for pointers are: To access dynamic data structures such as linked lists, trees, and queues.


2 Answers

Is this interpretation of this modification of the standard right or are there other rules that compensate the deletion of this cited sentence?

Yes, this interpretation is correct. A pointer past the end isn't simply convertible to another pointer value that happens to point to that address.

The new [basic.compound]/3 says:

Every value of pointer type is one of the following:
(3.1) a pointer to an object or function (the pointer is said to point to the object or function), or
(3.2) a pointer past the end of an object ([expr.add]), or

Those are mutually exclusive. p1+1 is a pointer past the end, not a pointer to an object. p1+1 points to a hypothetical x[1] of a size-1 array at p1, not to p2. Those two objects are not pointer-interconvertible.

We also have the non-normative note:

[ Note: A pointer past the end of an object ([expr.add]) is not considered to point to an unrelated object of the object's type that might be located at that address. [...]

which clarifies the intent.


As T.C. points out in numerous comments (notably this one), this is really a special case of the problem that comes with trying to implement std::vector - which is that [v.data(), v.data() + v.size()) needs to be a valid range and yet vector doesn't create an array object, so the only defined pointer arithmetic would be going from any given object in the vector to past-the-end of its hypothetical one-size array. Fore more resources, see CWG 2182, this std discussion, and two revisions of a paper on the subject: P0593R0 and P0593R1 (section 1.3 specifically).

like image 138
Barry Avatar answered Sep 22 '22 17:09

Barry


In your example, *(p1 + 1) = 10; should be UB, because it is one past the end of the array of size 1. But we are in a very special case here, because the array was dynamically constructed in a larger char array.

Dynamic object creation is described in 4.5 The C++ object model [intro.object], §3 of the n4659 draft of the C++ standard:

3 If a complete object is created (8.3.4) in storage associated with another object e of type “array of N unsigned char” or of type “array of N std::byte” (21.2.1), that array provides storage for the created object if:
(3.1) — the lifetime of e has begun and not ended, and
(3.2) — the storage for the new object fits entirely within e, and
(3.3) — there is no smaller array object that satisfies these constraints.

The 3.3 seems rather unclear, but the examples below make the intent more clear:

struct A { unsigned char a[32]; }; struct B { unsigned char b[16]; }; A a; B *b = new (a.a + 8) B; // a.a provides storage for *b int *p = new (b->b + 4) int; // b->b provides storage for *p // a.a does not provide storage for *p (directly), // but *p is nested within a (see below) 

So in the example, the buffer array provides storage for both *p1 and *p2.

The following paragraphs prove that the complete object for both *p1 and *p2 is buffer:

4 An object a is nested within another object b if:
(4.1) — a is a subobject of b, or
(4.2) — b provides storage for a, or
(4.3) — there exists an object c where a is nested within c, and c is nested within b.

5 For every object x, there is some object called the complete object of x, determined as follows:
(5.1) — If x is a complete object, then the complete object of x is itself.
(5.2) — Otherwise, the complete object of x is the complete object of the (unique) object that contains x.

Once this is established, the other relevant part of draft n4659 for C++17 is [basic.coumpound] §3(emphasize mine):

3 ... Every value of pointer type is one of the following:
(3.1) — a pointer to an object or function (the pointer is said to point to the object or function), or
(3.2) — a pointer past the end of an object (8.7), or
(3.3) — the null pointer value (7.11) for that type, or
(3.4) — an invalid pointer value.

A value of a pointer type that is a pointer to or past the end of an object represents the address of the first byte in memory (4.4) occupied by the object or the first byte in memory after the end of the storage occupied by the object, respectively. [ Note: A pointer past the end of an object (8.7) is not considered to point to an unrelated object of the object’s type that might be located at that address. A pointer value becomes invalid when the storage it denotes reaches the end of its storage duration; see 6.7. —end note ] For purposes of pointer arithmetic (8.7) and comparison (8.9, 8.10), a pointer past the end of the last element of an array x of n elements is considered to be equivalent to a pointer to a hypothetical element x[n]. The value representation of pointer types is implementation-defined. Pointers to layout-compatible types shall have the same value representation and alignment requirements (6.11)...

The note A pointer past the end... does not apply here because the objects pointed to by p1 and p2 and not unrelated, but are nested into the same complete object, so pointer arithmetics make sense inside the object that provide storage: p2 - p1 is defined and is (&buffer[sizeof(int)] - buffer]) / sizeof(int) that is 1.

So p1 + 1 is a pointer to *p2, and *(p1 + 1) = 10; has defined behaviour and sets the value of *p2.


I have also read the C4 annex on the compatibility between C++14 and current (C++17) standards. Removing the possibility to use pointer arithmetics between objects dynamically created in a single character array would be an important change that IMHO should be cited there, because it is a commonly used feature. As nothing about it exist in the compatibility pages, I think that it confirms that it was not the intent of the standard to forbid it.

In particular, it would defeat that common dynamic construction of an array of objects from a class with no default constructor:

class T {     ...     public T(U initialization) {         ...     } }; ... unsigned char *mem = new unsigned char[N * sizeof(T)]; T * arr = reinterpret_cast<T*>(mem); // See the array as an array of N T for (i=0; i<N; i++) {     U u(...);     new(arr + i) T(u); } 

arr can then be used as a pointer to the first element of an array...

like image 33
Serge Ballesta Avatar answered Sep 24 '22 17:09

Serge Ballesta