
Accessing struct data members via pointer arithmetic

If I have a simple tensor class like this

struct Tensor
{
    double XX, XY, XZ;
    double YX, YY, YZ;
    double ZX, ZY, ZZ;
};

Is it undefined behavior to use pointer-arithmetic (see below) to access its elements?

 double& Tensor::operator[](int i) 
{ 
    assert(i < 9); 
    return (&XX)[i]; 
}
Touloudou asked Oct 17 '19 08:10



3 Answers

There's a CppCon talk that mentions this!

So yes, it's undefined behaviour, because classes and arrays don't share a common initial sequence.

Edit: Miro Knejp introduces that slide at around 3:44 if you want more context for all the non-C++ on the slide, but the question and its answer are really the only part of the talk that bears on your question.

GeckoGeorge answered Nov 15 '22 18:11



Yes, it is undefined behavior.

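The data members are not elements of an array, so they are NOT guaranteed to be stored back-to-back in contiguous memory: the compiler may insert an unspecified amount of padding between them. And even if no padding is present, &XX points to a single double rather than to an array element, so indexing past it is still undefined.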

The correct way would be to access the members individually, e.g.:

#include <stdexcept> // std::out_of_range

double& Tensor::operator[](int i)
{
    switch (i)
    {
        case 0: return XX;
        case 1: return XY;
        case 2: return XZ;
        case 3: return YX;
        case 4: return YY;
        case 5: return YZ;
        case 6: return ZX;
        case 7: return ZY;
        case 8: return ZZ;
        default: throw std::out_of_range("invalid index");
    }
}

Alternatively, if you really want to use array syntax:

double& Tensor::operator[](int i)
{
    if ((i < 0) || (i > 8))
        throw std::out_of_range("invalid index");

    double* arr[] = {
        &XX, &XY, &XZ,
        &YX, &YY, &YZ, 
        &ZX, &ZY, &ZZ
    };

    return *(arr[i]);
}

Or

double& Tensor::operator[](int i)
{
    if ((i < 0) || (i > 8))
        throw std::out_of_range("invalid index");

    static double Tensor::* arr[] = {
        &Tensor::XX, &Tensor::XY, &Tensor::XZ,
        &Tensor::YX, &Tensor::YY, &Tensor::YZ, 
        &Tensor::ZX, &Tensor::ZY, &Tensor::ZZ
    };

    return this->*(arr[i]);
}

Otherwise, use an actual array for the data, and define methods to access the elements:

#include <stdexcept> // std::out_of_range

struct Tensor
{
    double data[9];

    double& XX() { return data[0]; }
    double& XY() { return data[1]; }
    double& XZ() { return data[2]; }
    double& YX() { return data[3]; }
    double& YY() { return data[4]; }
    double& YZ() { return data[5]; }
    double& ZX() { return data[6]; }
    double& ZY() { return data[7]; }
    double& ZZ() { return data[8]; }

    double& operator[](int i)
    {
        if ((i < 0) || (i > 8))
            throw std::out_of_range("invalid index");
        return data[i];
    }
};
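If you want to keep the named data members from the question, here is a minimal self-contained sketch of the pointer-to-member variant. The helper name member(), the const overload, and the example() function are my own additions for illustration, and the index type is switched to std::size_t so that only one bound needs checking.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

struct Tensor
{
    double XX, XY, XZ;
    double YX, YY, YZ;
    double ZX, ZY, ZZ;

    // Indexed access goes through a pointer-to-member, never through
    // pointer arithmetic on &XX.
    double& operator[](std::size_t i) { return this->*member(i); }
    const double& operator[](std::size_t i) const { return this->*member(i); }

private:
    // Hypothetical helper: maps an index in [0, 9) to the matching data
    // member and throws std::out_of_range for anything else.
    static double Tensor::* member(std::size_t i)
    {
        static constexpr double Tensor::* table[] = {
            &Tensor::XX, &Tensor::XY, &Tensor::XZ,
            &Tensor::YX, &Tensor::YY, &Tensor::YZ,
            &Tensor::ZX, &Tensor::ZY, &Tensor::ZZ
        };
        if (i >= 9)
            throw std::out_of_range("invalid index");
        return table[i];
    }
};

// Usage: indexed and named access refer to the same storage.
inline void example()
{
    Tensor t{};          // all nine members start at zero
    t[4] = 2.5;          // index 4 is YY
    assert(t.YY == 2.5);
}
```

The table is built once and shared by both overloads, so the const and non-const versions cannot drift apart.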
Remy Lebeau answered Nov 15 '22 17:11



It is undefined behavior.

In general, pointer arithmetic is well-defined only within the elements of an array, plus the one-past-the-end position (see the additive operators section of the standard, [expr.add], numbered §8.5.6 in some working drafts).

For classes/structures, this can't work, because the compiler can add padding or other data between the members. cppreference has a brief description of the class layout.
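If you are curious what layout your compiler actually picks, a small sketch like the following can report it at compile time. The constant name padding_bytes is made up for this example. Keep in mind that this only inspects the layout; it does not make the pointer arithmetic legal.

```cpp
#include <cstddef>      // offsetof, std::size_t
#include <type_traits>  // std::is_standard_layout

struct Tensor
{
    double XX, XY, XZ;
    double YX, YY, YZ;
    double ZX, ZY, ZZ;
};

// offsetof is only unconditionally supported for standard-layout types.
static_assert(std::is_standard_layout<Tensor>::value,
              "Tensor is expected to be standard-layout");

// The first member of a standard-layout struct sits at offset 0.
static_assert(offsetof(Tensor, XX) == 0, "XX is the first member");

// The struct holds nine doubles; anything beyond that is padding.
static_assert(sizeof(Tensor) >= 9 * sizeof(double),
              "Tensor cannot be smaller than its members");

// Padding on this particular implementation (hypothetical name).
// Often 0 in practice, but the language does not promise that.
constexpr std::size_t padding_bytes = sizeof(Tensor) - 9 * sizeof(double);
```

Even when padding_bytes turns out to be 0, (&XX)[i] is still undefined for i > 0, because &XX points to a single double and not to an element of an array.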

Now, moving to solutions to your problem, the first one would be to simply use something made for this, such as Eigen. It is a mature library for linear algebra, with well-tested code and good optimizations.

If you would rather not add a new library, you will have to implement member access or operator[] more or less by hand.
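As a rough sketch of what "by hand" could look like, here is a small array-backed class with (row, col) access on top of flat indexing. The class name Tensor3, the row-major ordering, and the choice of assert for the (row, col) preconditions are all assumptions made for this example, not something the question prescribes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical class for illustration: a 3x3 tensor stored in one real array,
// so every index computation stays inside a single array object.
class Tensor3
{
public:
    // (row, col) access; preconditions are checked with assert only.
    double& operator()(std::size_t row, std::size_t col)
    {
        assert(row < 3 && col < 3);
        return data_[row * 3 + col];
    }
    const double& operator()(std::size_t row, std::size_t col) const
    {
        assert(row < 3 && col < 3);
        return data_[row * 3 + col];
    }

    // Flat access in row-major order: 0 is XX, 1 is XY, ..., 8 is ZZ.
    // std::array::at throws std::out_of_range on a bad index.
    double& operator[](std::size_t i) { return data_.at(i); }
    const double& operator[](std::size_t i) const { return data_.at(i); }

private:
    std::array<double, 9> data_{};  // zero-initialized storage
};
```

With this layout, t(1, 2) and t[5] name the same element (YZ), and the indexing is ordinary array access rather than arithmetic across separate members.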

Paul92 answered Nov 15 '22 16:11
