Apparent contradiction between Stroustrup book and the C++ Standard

Question

I'm trying to understand the following paragraph from Stroustrup's "The C++ Programming Language" on page 282 (emphasis is mine):

To deallocate space allocated by new, delete and delete[] must be able to determine the size of the object allocated. This implies that an object allocated using the standard implementation of new will occupy slightly more space than a static object. At a minimum, space is needed to hold the object’s size. Usually two or more words per allocation are used for free-store management. Most modern machines use 8-byte words. This overhead is not significant when we allocate many objects or large objects, but it can matter if we allocate lots of small objects (e.g., ints or Points) on the free store.

Note that, in the sentence highlighted above, the author doesn't distinguish between array and non-array objects.

But according to paragraph §5.3.4/11 in C++14, we have (my emphasis):

When a new-expression calls an allocation function and that allocation has not been extended, the new-expression passes the amount of space requested to the allocation function as the first argument of type std::size_t. That argument shall be no less than the size of the object being created; it may be greater than the size of the object being created only if the object is an array.

I may be missing something, but it seems to me that these two statements contradict each other. It was my understanding that the additional space was required only for array objects, and that this additional space would hold the number of elements in the array, not the array size in bytes.

Yakk - Adam Nevraumont · Accepted Answer

If you call new on a type T, the overloaded operator new that may be invoked will be passed exactly sizeof(T).
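A small way to check this for a class-specific overload is to have operator new record the size it was handed. This is only a sketch: Widget, last_request and requested_bytes_for_single_widget are made-up names for illustration, not anything from the book or the standard.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical type for illustration only.
struct Widget {
    double payload[3];

    // Remembers the byte count the new-expression passed to operator new.
    static std::size_t last_request;

    static void* operator new(std::size_t sz) {
        last_request = sz;
        if (void* p = std::malloc(sz)) return p;
        throw std::bad_alloc{};
    }
    static void operator delete(void* p) noexcept { std::free(p); }
};

std::size_t Widget::last_request = 0;

// Allocates a single (non-array) Widget and reports what operator new saw.
std::size_t requested_bytes_for_single_widget() {
    Widget* w = new Widget;
    std::size_t seen = Widget::last_request;
    delete w;
    return seen;
}
```

Calling requested_bytes_for_single_widget() should give back sizeof(Widget), since a non-array new-expression is not allowed to pad the request.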

If you implement a new of your own (or an allocator) that uses some different memory store (i.e., not just forwarding to another call to new or malloc etc.), you'll find yourself wanting to store information to clean up the allocation later, when the delete occurs. A typical way to do this is to get a slightly larger block of memory, store the amount of memory requested at the start of it, and then return a pointer to a later position in the block you acquired.

This is roughly what most standard implementations of new (and malloc) do.

So while you only need sizeof(T) bytes to store a T, the amount of bytes consumed by new/malloc is more than sizeof(T). This is what Stroustrup is talking about: every dynamic allocation has actual overhead, and that overhead can be substantial if you make lots of small allocations.
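The "store the size in front of the block" idea can be sketched as follows. The names sized_alloc, sized_size, sized_free and kHeader are invented for this illustration, and real allocators keep more bookkeeping than a single size_t, so treat it as a toy rather than how any particular library does it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Header size, rounded up so the pointer handed to the caller
// stays aligned for any fundamental type.
constexpr std::size_t kHeader =
    (sizeof(std::size_t) + alignof(std::max_align_t) - 1)
    / alignof(std::max_align_t) * alignof(std::max_align_t);

// Grabs kHeader extra bytes, writes the requested size at the front,
// and returns a pointer just past the header.
void* sized_alloc(std::size_t n) {
    unsigned char* raw =
        static_cast<unsigned char*>(std::malloc(kHeader + n));
    if (!raw) return nullptr;
    std::memcpy(raw, &n, sizeof n);
    return raw + kHeader;
}

// Steps back over the header to recover the size requested earlier.
std::size_t sized_size(const void* p) {
    std::size_t n;
    std::memcpy(&n, static_cast<const unsigned char*>(p) - kHeader, sizeof n);
    return n;
}

// Frees the whole block, header included.
void sized_free(void* p) {
    if (p) std::free(static_cast<unsigned char*>(p) - kHeader);
}
```

Every allocation here costs kHeader bytes on top of what the caller asked for, which is the per-allocation overhead Stroustrup is describing.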

There are some allocators that don't need that extra room "before" the allocation. For example, a stack-scoped allocator that doesn't delete anything until it goes out of scope. Or one that allocates from stores of fixed-sized blocks and uses a bitfield to describe which are in use.

Here, the accounting information isn't stored adjacent to the data, or we make the accounting information implicit in the code state (scoped allocators).
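A fixed-size-block pool with a bitfield might look roughly like this. FixedPool is a made-up class for illustration; it assumes BlockSize is a multiple of the alignment your objects need, and it does no checking of the pointers passed to deallocate.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hands out fixed-size blocks from one buffer. Which blocks are in use
// is tracked in a separate bitset, so nothing is stored next to the data.
template <std::size_t BlockSize, std::size_t BlockCount>
class FixedPool {
public:
    // Returns the first free block, or nullptr if the pool is full.
    void* allocate() {
        for (std::size_t i = 0; i < BlockCount; ++i) {
            if (!used_[i]) {
                used_[i] = true;
                return storage_ + i * BlockSize;
            }
        }
        return nullptr;
    }

    // The block index is recovered from the pointer itself.
    void deallocate(void* p) {
        std::size_t offset = static_cast<std::size_t>(
            static_cast<unsigned char*>(p) - storage_);
        used_[offset / BlockSize] = false;
    }

    std::size_t in_use() const { return used_.count(); }

private:
    alignas(std::max_align_t) unsigned char storage_[BlockSize * BlockCount];
    std::bitset<BlockCount> used_;
};
```

The bookkeeping cost is one bit per block, and the size of each allocation is implicit in the pool type.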

Now, in the case of arrays, the C++ compiler is free to call operator new[] with an amount of memory requested larger than sizeof(T)*n when T[n] is allocated. This is done by new (not operator new) code generated by the compiler when it asks your overload for memory.

This is traditionally done on types with non-trivial destructors so that the C++ runtime can, when delete[] is called, iterate over each of the items and call .~T() on them. It pulls off a similar trick, where it stuffs n into memory before the array it is using, then does pointer arithmetic to extract it at delete time.

This is not required by the standard, but it is a common technique (clang and gcc both do it at least on some platforms, and I believe MSVC does as well). Some method of calculating the size of the array is needed; this is just one of them.

For something without a destructor (like char) or with a trivial one (like struct foo{ ~foo()=default; }), n isn't needed by the runtime, so it doesn't have to store it. So it can say "naw, I won't store it".
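To see the difference between the two cases on your own compiler, you can record what operator new[] is asked for and subtract n*sizeof(T). Recorder, Trivial, NonTrivial and extra_bytes are made-up names for this sketch, and the numbers you get are implementation details, not something the standard promises.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Shared allocation hooks that remember the last size requested.
struct Recorder {
    static std::size_t last;
    static void* operator new[](std::size_t sz) {
        last = sz;
        if (void* p = std::malloc(sz)) return p;
        throw std::bad_alloc{};
    }
    static void operator delete[](void* p) noexcept { std::free(p); }
};
std::size_t Recorder::last = 0;

struct Trivial : Recorder { int x; };                      // trivial destructor
struct NonTrivial : Recorder { int x; ~NonTrivial() {} };  // user-provided destructor

// Bytes requested beyond n * sizeof(T) when allocating T[n].
template <class T>
std::size_t extra_bytes(std::size_t n) {
    T* p = new T[n];
    std::size_t extra = Recorder::last - n * sizeof(T);
    delete[] p;
    return extra;
}
```

On the implementations discussed here I would expect extra_bytes<Trivial>(5) to be 0 and extra_bytes<NonTrivial>(5) to be around sizeof(std::size_t), but other values are allowed.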

Here is a live example.

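#include <cstddef>   // std::size_t
#include <cstdlib>   // std::malloc, std::free
#include <iostream>
#include <iterator>  // std::prev

struct foo {
  // Report how many bytes the new-expression asked for.
  static void* operator new[](std::size_t sz) {
    std::cout << sz << '/' << sizeof(foo) << '=' << sz/sizeof(foo)
              << "+ R(" << sz%sizeof(foo) << ")" << '\n';
    return std::malloc(sz);
  }
  static void operator delete[](void* ptr) {
    std::free(ptr);
  }
  virtual ~foo() {}  // non-trivial destructor
};

foo* test(std::size_t n) {
  std::cout << n << '\n';
  return new foo[n];
}

int main(int argc, char** argv) {
  foo* f = test( argc+10 );
  // Undefined behavior: peeks at the word just before the array.
  std::cout << *std::prev(reinterpret_cast<std::size_t*>(f)) << '\n';
}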

If run with 0 arguments, it prints out 11, 96/8=12+ R(0) and 11.

The first is the number of elements allocated. The second is how much memory was requested, which adds up to 11 elements' worth plus 8 bytes (sizeof(size_t), I suspect). The last is what we happen to find right before the start of the array of 11 elements: a size_t with the value 11.

Accessing memory before the start of the array is naturally undefined behavior, but I did it in order to expose some implementation details in gcc/clang. The point is that they did ask for an extra 8 bytes (as predicted), and they did happen to store the value 11 there (the size of the array).

If you overwrite that stored 11 with 2, a later call to delete[] will actually destroy the wrong number of elements.

Other solutions (to store how big the array is) are naturally possible. As an example, if you know you aren't calling an overload of new and you know details of your underlying memory allocation, you could reuse the data it uses to know your block size to determine the number of elements, thus saving an extra size_t of memory. This requires knowing that your underlying allocator won't over-allocate on you, and that it stores the bytes used at a known offset to the data-pointer.

Or, in theory, a compiler could build a separate pointer->size map.
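A hand-written version of such a side table could look like the sketch below. The names element_counts, tracked_array_alloc and tracked_array_free are invented here, and this is library code standing in for what a compiler runtime would have to do; I don't know of a compiler that works this way.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <unordered_map>

// Side table mapping each array's address to its element count.
std::unordered_map<void*, std::size_t>& element_counts() {
    static std::unordered_map<void*, std::size_t> table;
    return table;
}

// Allocates exactly elem_size * count bytes; the count lives in the table,
// not in front of the array.
void* tracked_array_alloc(std::size_t elem_size, std::size_t count) {
    void* p = std::malloc(elem_size * count);
    if (p) element_counts()[p] = count;
    return p;
}

// Looks up the element count (which delete[] would use to run the
// destructors), forgets the entry, and frees the block.
std::size_t tracked_array_free(void* p) {
    std::size_t count = 0;
    auto it = element_counts().find(p);
    if (it != element_counts().end()) {
        count = it->second;
        element_counts().erase(it);
    }
    std::free(p);
    return count;
}
```

The array itself carries no extra bytes, at the cost of a lookup on every deallocation and the memory used by the table.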

I am unaware of compilers that do either of these, but would be surprised by neither.

Allowing this technique is what the C++ standard is talking about. For array allocation, the compiler's new (not operator new) code is permitted to ask operator new for extra memory. For non-array allocation, the compiler's new is not permitted to ask operator new for extra memory; it must ask for the exact amount. (I believe the exception is memory-allocation merging, which is what "that allocation has not been extended" in the quoted paragraph refers to.)

As you can see, the two situations are different.
