 

Reshaping a 1-d array to a multidimensional array

Taking into consideration the entire C++11 standard, is it possible for any conforming implementation to pass the first assertion below but fail the second?

#include <cassert>

int main(int, char**)
{  
    const int I = 5, J = 4, K = 3;
    const int N = I * J * K;

    int arr1d[N] = {0};
    int (&arr3d)[I][J][K] = reinterpret_cast<int (&)[I][J][K]>(arr1d);
    assert(static_cast<void*>(arr1d) ==
           static_cast<void*>(arr3d)); // is this necessary?

    arr3d[3][2][1] = 1;
    assert(arr1d[3 * (J * K) + 2 * K + 1] == 1); // UB?
}

If not, is this technically UB, and does the answer change if the first assertion is removed (that is, is reinterpret_cast guaranteed to preserve addresses here)? Also, what if the reshaping is done in the opposite direction (3D to 1D), or from a 6x35 array to a 10x21 array?

EDIT: If the answer is that this is UB because of the reinterpret_cast, is there some other strictly compliant way of reshaping (e.g., via static_cast to/from an intermediate void *)?

Stephen Lin asked Mar 07 '13 23:03



2 Answers

Update 2021-03-20:

This same question was asked on Reddit recently and it was pointed out that my original answer is flawed because it does not take into account this aliasing rule:

If a program attempts to access the stored value of an object through a glvalue whose type is not similar to one of the following types the behavior is undefined:

  • the dynamic type of the object,
  • a type that is the signed or unsigned type corresponding to the dynamic type of the object, or
  • a char, unsigned char, or std::byte type.

Under the rules for similarity, the two array types are not similar in any of the above cases, and therefore it is technically undefined behaviour to access the 1D array through the 3D array. (This is definitely one of those situations where, in practice, it will almost certainly work with most compilers/targets.)

Note that the section references in the original answer refer to an older C++11 draft standard.

Original answer:

reinterpret_cast of references

The standard states that an lvalue of type T1 can be reinterpret_cast to a reference to T2 if a pointer to T1 can be reinterpret_cast to a pointer to T2 (§5.2.10/11):

An lvalue expression of type T1 can be cast to the type “reference to T2” if an expression of type “pointer to T1” can be explicitly converted to the type “pointer to T2” using a reinterpret_cast.

So we need to determine if an int(*)[N] can be converted to an int(*)[I][J][K].

reinterpret_cast of pointers

A pointer to T1 can be reinterpret_cast to a pointer to T2 if both T1 and T2 are standard-layout types and T2 has no stricter alignment requirements than T1 (§5.2.10/7):

When a prvalue v of type “pointer to T1” is converted to the type “pointer to cv T2”, the result is static_cast<cv T2*>(static_cast<cv void*>(v)) if both T1 and T2 are standard-layout types (3.9) and the alignment requirements of T2 are no stricter than those of T1, or if either type is void.

  1. Are int[N] and int[I][J][K] standard-layout types?

    int is a scalar type and arrays of scalar types are considered to be standard-layout types (§3.9/9).

    Scalar types, standard-layout class types (Clause 9), arrays of such types and cv-qualified versions of these types (3.9.3) are collectively called standard-layout types.

  2. Does int[I][J][K] have no stricter alignment requirements than int[N]?

    The result of the alignof operator gives the alignment requirement of a complete object type (§3.11/2).

    The result of the alignof operator reflects the alignment requirement of the type in the complete-object case.

    Since the two arrays here are not subobjects of any other object, they are complete objects. Applying alignof to an array gives the alignment requirement of the element type (§5.3.6/3):

    When alignof is applied to an array type, the result shall be the alignment of the element type.

    So both array types have the same alignment requirement.

That makes the reinterpret_cast valid and equivalent to:

int (&arr3d)[I][J][K] = *reinterpret_cast<int (*)[I][J][K]>(&arr1d);

where * and & are the built-in operators, which is then equivalent to:

int (&arr3d)[I][J][K] = *static_cast<int (*)[I][J][K]>(static_cast<void*>(&arr1d));

static_cast through void*

The static_cast to void* is allowed by the standard conversions (§4.10/2):

A prvalue of type “pointer to cv T,” where T is an object type, can be converted to a prvalue of type “pointer to cv void”. The result of converting a “pointer to cv T” to a “pointer to cv void” points to the start of the storage location where the object of type T resides, as if the object is a most derived object (1.8) of type T (that is, not a base class subobject).

The static_cast to int(*)[I][J][K] is then allowed (§5.2.9/13):

A prvalue of type “pointer to cv1 void” can be converted to a prvalue of type “pointer to cv2 T,” where T is an object type and cv2 is the same cv-qualification as, or greater cv-qualification than, cv1.

So the cast is fine! But are we okay to access objects through the new array reference?

Accessing array elements

Array subscripting of the form E1[E2] is equivalent to *((E1)+(E2)) (§5.2.1/1). Let's consider the following array subscripting:

arr3d[3][2][1]

Firstly, arr3d[3] is equivalent to *((arr3d)+(3)). The lvalue arr3d undergoes array-to-pointer conversion to give an int(*)[J][K]. There is no requirement that the underlying array must be of the correct type to do this conversion. The pointer's value is then accessed (which is fine by §3.10) and then the value 3 is added to it. This pointer arithmetic is also fine (§5.7/5):

If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.

This pointer is then dereferenced to give an int[J][K]. This undergoes the same process for the next two subscripts, resulting in the final int lvalue at the appropriate array index. It is an lvalue due to the result of * (§5.3.1/1):

The unary * operator performs indirection: the expression to which it is applied shall be a pointer to an object type, or a pointer to a function type and the result is an lvalue referring to the object or function to which the expression points.

It is then perfectly fine to access the actual int object through this lvalue because the lvalue is of type int too (§3.10/10):

If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:

  • the dynamic type of the object
  • [...]

So unless I've missed something, I'd say this program is well-defined.

Joseph Mansfield answered Sep 28 '22 09:09


I am under the impression that it will work. Both arrays refer to the same piece of contiguous memory. I know the C standard guarantees that the elements are contiguous, at least. I don't know what the C++11 standard says.

The first assert should always be true, however. The address of the first element of the array will always be the same, and so will every other address, since the same piece of memory is being used.

I would therefore also say that the second assert will always hold, at least as long as the elements are always laid out in row-major order. That ordering is also guaranteed by the C standard, and I would be surprised if the C++11 standard said anything different.

AxelOmega answered Sep 28 '22 09:09