
Alignment of char arrays

Tags: c++, alignment

How is the STL vector usually implemented? It has raw storage, a char[], which it occasionally resizes by a certain factor, and it calls placement new whenever an element is pushed_back (a very interesting grammatical form, I should note; linguists should study verb forms such as pushed_back :)
And then there are the alignment requirements. So a natural question arises: how can I call placement new on a char[] and be sure the alignment requirements are satisfied? I searched the C++ standard of 2003 for the word "alignment" and found these:

Section 3.9, paragraph 5:

Object types have alignment requirements (3.9.1, 3.9.2). The alignment of a complete object type is an implementation-defined integer value representing a number of bytes; an object is allocated at an address that meets the alignment requirements of its object type.

Section 5.3.4, paragraph 10:

A new-expression passes the amount of space requested to the allocation function as the first argument of type std::size_t. That argument shall be no less than the size of the object being created; it may be greater than the size of the object being created only if the object is an array. For arrays of char and unsigned char, the difference between the result of the new-expression and the address returned by the allocation function shall be an integral multiple of the most stringent alignment requirement (3.9) of any object type whose size is no greater than the size of the array being created. [Note: Because allocation functions are assumed to return pointers to storage that is appropriately aligned for objects of any type, this constraint on array allocation overhead permits the common idiom of allocating character arrays into which objects of other types will later be placed. ]
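The idiom that the note refers to can be sketched roughly as follows. This is only an illustration, not code from the standard: `Widget` and `widget_fits_in_char_array` are hypothetical names, and `alignof` assumes a C++11 or later compiler (C++03 has no portable way to query alignment).

```cpp
#include <cassert>
#include <cstdint>
#include <new>

// Hypothetical example type, not taken from the question.
struct Widget {
    double weight;
    int id;
    Widget(double w, int i) : weight(w), id(i) {}
};

// Builds a Widget inside a dynamically allocated char array and reports
// whether the array's address satisfied Widget's alignment requirement.
bool widget_fits_in_char_array() {
    char* raw = new char[sizeof(Widget)];   // plain array of char
    bool aligned =
        reinterpret_cast<std::uintptr_t>(raw) % alignof(Widget) == 0;
    assert(aligned);                        // what the quoted paragraph promises

    Widget* w = new (raw) Widget(1.5, 42);  // placement new into the array
    bool ok = aligned && w->id == 42;

    w->~Widget();                           // destroy the object explicitly
    delete[] raw;                           // release the storage as a char array
    return ok;
}
```

Note that the guarantee covers arrays created by a new-expression; a `char` array that is a local variable or a class member gets no such promise.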

These two give a perfectly satisfactory answer for my above question, but...

Statement1:
An alignment requirement for an object of type X where sizeof(X) == n is at least the requirement that address of X be divisible by n or something like that (put all the architecture-dependent things into the "or something like that").

Question1: Please confirm, refine, or deny the above statement1.

Statement2: If Statement1 is correct, then it follows from the second quote from the standard that an array of 5000000 chars is allocated at an address divisible by 5000000, which is completely unnecessary if I just need the array of char as such, not as raw storage for possible placement of other objects.

Question2: So, are the chances of successfully allocating 1000 chars really lower than those of allocating 500 shorts (provided short is 2 bytes)? Is it a problem in practice?

Armen Tsirunyan asked Oct 24 '10 17:10


2 Answers

When you dynamically allocate memory using operator new, you have the guarantee that:

The pointer returned shall be suitably aligned so that it can be converted to a pointer of any complete object type and then used to access the object or array in the storage allocated (until the storage is explicitly deallocated by a call to a corresponding deallocation function) (C++03 3.7.3.1/2).

vector does not create an array of char; it uses an allocator. The default allocator uses ::operator new to allocate memory.
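As a rough illustration of that guarantee, the sketch below checks the address returned by `::operator new` and by `std::allocator` against the alignment of the element type. The helper names (`is_aligned`, `raw_new_is_aligned`, `allocator_is_aligned`) are made up for this example, and `alignof` assumes C++11 or later.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>

// True if p is a multiple of the given (non-zero) alignment.
inline bool is_aligned(const void* p, std::size_t alignment) {
    assert(alignment != 0);
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Allocates room for n objects of T with ::operator new and checks the result.
template <typename T>
bool raw_new_is_aligned(std::size_t n) {
    void* p = ::operator new(n * sizeof(T));
    bool ok = is_aligned(p, alignof(T));
    ::operator delete(p);
    return ok;
}

// Same check through std::allocator, the default allocator of std::vector.
template <typename T>
bool allocator_is_aligned(std::size_t n) {
    std::allocator<T> alloc;
    T* p = alloc.allocate(n);
    bool ok = is_aligned(p, alignof(T));
    alloc.deallocate(p, n);
    return ok;
}
```

For ordinary types such as `double` or `long long`, both functions should return true for any `n`; over-aligned extension types are a separate matter.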

James McNellis answered Sep 27 '22 03:09


An alignment requirement for an object of type X where sizeof(X) == n is at least the requirement that address of X be divisible by n or something like that

No. The alignment requirement of a type is always a factor of its size, but need not be equal to its size. For a class, it is usually equal to the greatest of the alignment requirements of its members.

An array of 5M char, on its own account, need only have an alignment requirement of 1, the same as the alignment requirement of a single char.
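These points can be checked at compile time, assuming a C++11 compiler for `alignof` and `static_assert` (neither exists in C++03). `Packet`, `alignment_below_size`, and `check_packet_layout` are hypothetical names used only for this sketch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical example type, not from the thread.
struct Packet {
    char   tag;
    double value;
    short  count;
};

// The size of a type is a multiple of its alignment, so that every
// element of an array of that type stays aligned.
static_assert(sizeof(Packet) % alignof(Packet) == 0, "alignment divides size");

// An array type has the alignment of its element type, whatever its length.
static_assert(alignof(char[5000000]) == alignof(char),
              "array inherits element alignment");
static_assert(alignof(char) == 1, "char has the weakest alignment");

// True when the alignment of T is strictly smaller than its size.
template <typename T>
bool alignment_below_size() { return alignof(T) < sizeof(T); }

// Runtime checks for the example type.
inline void check_packet_layout() {
    assert(alignment_below_size<Packet>());   // alignment is not the size
    assert(alignof(Packet) >= alignof(char));
}
```

The exact values of `sizeof(Packet)` and `alignof(Packet)` depend on the platform, which is why the sketch asserts only relationships between them.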

So the text you quote about the alignment of memory allocated via global operator new (malloc has a similar although, IIRC, not identical requirement) in effect means that a large allocation must obey the most stringent alignment requirement of any type in the system. Beyond that, implementations often exclude large SIMD types from this and require that memory for SIMD be specially allocated. This is slightly dubious, but I think they justify it on the basis that non-standard extension types can impose arbitrary special requirements.

So in practice the number which you think is 5000000 is often 4 :-)
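To see which number a particular implementation actually produces, one can measure the largest power of two that divides the address of a big char array. This is a minimal sketch with made-up helper names (`largest_pow2_divisor`, `big_char_array_alignment`); the result depends on the platform and allocator, so no particular value is claimed here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Largest power of two that divides addr (returns 0 for addr == 0).
inline std::uintptr_t largest_pow2_divisor(std::uintptr_t addr) {
    return addr & (~addr + 1);   // isolates the lowest set bit
}

// Allocates 5000000 chars and returns the alignment the address actually has.
inline std::uintptr_t big_char_array_alignment() {
    const std::size_t n = 5000000;
    char* p = new char[n];
    std::uintptr_t a =
        largest_pow2_divisor(reinterpret_cast<std::uintptr_t>(p));
    delete[] p;
    assert(a >= alignof(char));  // any address satisfies char's alignment
    return a;
}
```

Comparing the returned value with `alignof(std::max_align_t)` (C++11, from `<cstddef>`) shows how far the allocator goes beyond, or merely meets, the strictest fundamental alignment; it will be nowhere near 5000000.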

Steve Jessop answered Sep 26 '22 03:09