Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

treating memory returned by operator new(sizeof(T) * N) as an array

In C one can allocate dynamic arrays using malloc(sizeof(T) * N) and then use pointer arithmetic to get elements at i offset in this dynamic array.

In C++ one can do similar using operator new() in the same way as malloc() and then placement new (for an example one can see solution for item 13 in a book "Exceptional C++: 47 engineering puzzles, programming problems, and solutions" by Herb Sutter). If you don't have one, the summary of the solution for this question would be:

T* storage = operator new(sizeof(T)*size);

// insert element    
T* p = storage + i;
new (p) T(element);

// get element
T* element = storage[i];

For me this looked legit since I'm asking for a chunk of memory with enough memory to hold N aligned elements of size = sizeof(T). Since sizeof(T) should return a size of element which is aligned, and they are laid one after another in a chunk of memory, using pointer arithmetic is OK here.

However I was then pointed to links like: http://eel.is/c++draft/expr.add#4 or http://eel.is/c++draft/intro.object#def:object and claiming that in C++ operator new() does not return an array object, so pointer arithmetic over what it has returned and using it as an array is undefined behavior as opposed to ANSI C.

I'm not this good at such low level stuff and I'm really trying to understand by reading this: https://www.ibm.com/developerworks/library/pa-dalign/ or this: http://jrruethe.github.io/blog/2015/08/23/placement-new/ but I still fail to understand if Sutter was just plain wrong?

I do understand that alignas make sense in constructions such as:

alignas(double) char array[sizeof(double)];

(c) http://georgeflanagin.com/alignas.php

If array appears to be not in a boundary of double (perhaps following char in a structure ran at 2-byte reading processor).

But this is different - I've requested memory from the heap/free storage especially requested operator new to return memory which will hold elements aligned to sizeof(T).

To summarize in case this was TL;DR:

  • Is it possible to use malloc() for dynamic arrays in C++?
  • Is it possible to use operator new() and placement new for dynamic arrays in older C++ which has no alignas keyword?
  • Is pointer arithmetic undefined behavior when used over memory returned by operator new()?
  • Is Sutter advising code which might break on some antique machine?

Sorry if this is dumb.

like image 870
xor256 Avatar asked Nov 23 '18 18:11

xor256


3 Answers

The issue of pointer arithmetic on allocated memory, as in your example:

T* storage = static_cast<T*>(operator new(sizeof(T)*size));
// ...
T* p = storage + i;  // precondition: 0 <= i < size
new (p) T(element);

being technically undefined behaviour has been known for a long time. It implies that std::vector can't be implemented with well-defined behaviour purely as a library, but requires additional guarantees from the implementation beyond those found in the standard.

It was definitely not the intention of the standards committee to make std::vector unimplementable. Sutter is, of course, right that such code is intended to be well-defined. The wording of the standard needs to reflect that.

P0593 is a proposal that, if accepted into the standard, may be able to solve this problem. In the meantime, it is fine to keep writing code like the above; no major compiler will treat it as UB.

Edit: As pointed out in the comments, I should have stated that when I said storage + i will be well-defined under P0593, I was assuming that the elements storage[0], storage[1], ..., storage[i-1] have already been constructed. Although I'm not sure I understand P0593 well enough to conclude that it wouldn't also cover the case where those elements hadn't already been constructed.

like image 170
Brian Bi Avatar answered Oct 17 '22 04:10

Brian Bi


The C++ standards contain an open issue that underlying representation of objects is not an "array" but a "sequence" of unsigned char objects. Still, everyone treats it as an array (which is intended), so it is safe to write the code like:

char* storage = static_cast<char*>(operator new(sizeof(T)*size));
// ...
char* p = storage + sizeof(T)*i;  // precondition: 0 <= i < size
new (p) T(element);

as long as void* operator new(size_t) returns a properly aligned value. Using sizeof-multiplied offsets to keep the alignment is safe.

In C++17, there is a macro STDCPP_DEFAULT_NEW_ALIGNMENT, which specifies the maximum safe alignment for "normal" void* operator new(size_t), and void* operator new(std::size_t size, std::align_val_t alignment) should be used if a larger alignment is required.

In earlier versions of C++, there is no such distinction, which means that void* operator new(size_t) needs to be implemented in a way that is compatible with the alignment of any object.

As to being able to do pointer arithmetic directly on T*, I am not sure it needs to be required by the standard. However, it is hard to implement the C++ memory model in such a way that it would not work.

like image 44
Kit. Avatar answered Oct 17 '22 03:10

Kit.


To all of the widely used recent posix-compatible systems, that is, Windows, Linux (& Android ofc.), and MacOSX the followings apply

Is it possible to use malloc() for dynamic arrays in C++?

Yes it is. Using reinterpret_cast to convert the resulting void* to the desired pointer type is the best practice, and it yields in a dynamically allocated array like this: type *array = reinterpret_cast<type*>(malloc(sizeof(type)*array_size); Be careful, that in this case constructors are not called on array elements, therefore it is still an uninitialized storage, no matter what type is. Nor destructors are called when free is used for deallocations


Is it possible to use operator new() and placement new for dynamic arrays in older C++ which has no alignas keyword?

Yes, but you need to be aware of alignment in case of placement new, if you feed it with custom locations (i.e ones that do not come from malloc/new). Normal operator new, as well as malloc, will provide native word aligned memory areas (at least whenever allocation size >= wordsize). This fact and the one that structure layouts and sizes are determined so that alignment is properly considered, you don't need to worry about alignment of dyn arrays if malloc or new is used. One might notice, that word size is sometimes significantly smaller than the greatest built-in data type (which is typically long double), but it must be aligned the same way, since alignment is not about data size, but the bit width of addresses on memory bus for different access sizes.


Is pointer arithmetic undefined behavior when used over memory returned by operator new()?

Nope, as long as you respect the process' memory boundaries -- from this point of view new basically works the same way as malloc, moreover, new actually calls malloc in the vast majority of implementations in order to acquire the required area. As a matter of fact, pointer arithmetic as such is never invalid. However, result of an arithmetic expression that evaluates to a pointer might point to a location outside of permitted areas, but this is not the fault of pointer arithmetic, but of the flawed expression.


Is Sutter advising code which might break on some antique machine?

I don't think so, provided the right compiler is used. (don't compile avr instructions or 128-bit-wide memory mov's into a binary that's intended to run on a 80386) Of course, on different machines with different memory sizes and layouts the same literal address may access areas of different purpose/status/existence, but why would you use literal addresses unless you write driver code to a specific hardware?... :)

like image 34
Géza Török Avatar answered Oct 17 '22 04:10

Géza Török