
Am I breaking strict aliasing rules?

I would like to know whether I'm breaking strict aliasing rules with this snippet. (I think so, since it dereferences a type-punned pointer; however, it's done in a single expression and /Wall doesn't complain.)

inline double plop() const // member function
{
    __m128d x = _mm_load_pd(v);
    ... // some stuff
    return *(reinterpret_cast<double*>(&x)); // return the lower double in xmm reg referred to by x.
}

If yes, what's the workaround? Using different representations of the same object simultaneously gets really hard once you want to respect the spec.

Thanks for your answers, I'm losing my good mood trying to find a solution.

Answers that won't be accepted and why:

"use mm_store" -> The optimizer fails to remove the store if the following instructions need the value in an xmm register, so it generates a load right after it. A store plus a load for nothing.

"use a union" -> Aliasing rule violation when the two types are used for the same object, if I understood correctly the article written by Thiago Macieira.

PixelRick asked Apr 17 '14 18:04


2 Answers

The bullet point about aggregate or union types (the sixth one in the list below) should, I think, allow your cast here, as we may consider __m128d as an aggregate of two doubles in a union with the full register. With regard to strict aliasing, compilers have always been very lenient about unions, whereas originally only a cast to (char*) was supposed to be valid.

§3.10: If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined (The intent of this list is to specify those circumstances in which an object may or may not be aliased):

  • the dynamic type of the object,
  • a cv-qualified version of the dynamic type of the object,
  • a type similar (as defined in 4.4) to the dynamic type of the object,
  • a type that is the signed or unsigned type corresponding to the dynamic type of the object,
  • a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,
  • an aggregate or union type that includes one of the aforementioned types among its elements or nonstatic data members (including, recursively, an element or non-static data member of a subaggregate or contained union),
  • a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,
  • a char or unsigned char type.
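
To illustrate the last bullet only, here is a minimal sketch of reading a double out of a differently typed object by copying its bytes, which is the kind of access the char / unsigned char entry permits. It is not the cast from the question and not something this answer proposes; `Opaque128`, `pack` and `low_double` are hypothetical names, and `Opaque128` is a plain stand-in for a 16-byte register type, not the real __m128d.

```cpp
#include <cstring>

// Hypothetical 16-byte stand-in for a vector register type (not __m128d).
struct Opaque128 {
    alignas(16) unsigned char raw[16];
};

static_assert(2 * sizeof(double) <= sizeof(Opaque128),
              "sketch assumes two doubles fit in 16 bytes");

// Builds the stand-in by copying the bytes of two doubles into it.
inline Opaque128 pack(double lo, double hi) {
    Opaque128 r;
    std::memcpy(r.raw, &lo, sizeof lo);
    std::memcpy(r.raw + sizeof lo, &hi, sizeof hi);
    return r;
}

// Copies the first sizeof(double) bytes back out into a double.
// No double* ever points at the Opaque128 object.
inline double low_double(const Opaque128& v) {
    double out;
    std::memcpy(&out, v.raw, sizeof out);
    return out;
}
```

Calling `low_double(pack(1.5, 2.5))` gives back 1.5. Whether a compiler folds such a copy into a register move is not guaranteed and would need to be checked in the generated code.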

galop1n answered Nov 09 '22 16:11


There is only one intrinsic that "extracts" the lower-order double value from an xmm register:

double _mm_cvtsd_f64 (__m128d a)

You could use it this way:

return _mm_cvtsd_f64(x);

The references contradict each other somewhat. MSDN says: "This intrinsic does not map to any specific machine instruction", while the Intel intrinsics guide mentions the movsd instruction. In the latter case the additional instruction is easily eliminated by the optimizer; at least gcc 4.8.1 with the -O2 flag generates code with no additional instruction.
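
A minimal self-contained sketch of the suggestion above, assuming an x86 target with SSE2. `low_of_doubled` is a hypothetical free-function version of the question's member function, the addition only stands in for the elided "some stuff", and the unaligned load is used here so the sketch does not depend on the caller's alignment.

```cpp
#include <emmintrin.h>  // SSE2: __m128d, _mm_loadu_pd, _mm_add_pd, _mm_cvtsd_f64

// Hypothetical stand-in for the question's plop().
// v must point to at least two readable doubles.
inline double low_of_doubled(const double* v) {
    __m128d x = _mm_loadu_pd(v);  // unaligned load; _mm_load_pd requires 16-byte alignment
    x = _mm_add_pd(x, x);         // placeholder for the elided work on x
    return _mm_cvtsd_f64(x);      // lower double as a scalar, no pointer cast involved
}
```

With `double v[2] = {1.5, 2.5};`, `low_of_doubled(v)` returns 3.0.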

Evgeny Kluev answered Nov 09 '22 16:11