Yes, I have read "SIMD code runs slower than scalar code". No, this is not really a duplicate.
I have been working with 2D math for a while, and I am in the process of porting my codebase from C to C++. There are a few walls I've hit with C that mean I really need polymorphism, but that's another story. Anyway, I had considered this a while ago, and the port presented a perfect opportunity to write a 2D vector class, including SSE implementations of the common math operations. Yes, I know there are libraries out there, but I wanted to try it myself to understand what's going on, and I don't use anything more complicated than +=.
My implementation uses <immintrin.h>, with a
union {
    __m128d ss;
    struct {
        double x;
        double y;
    };
};
The SSE version seemed slow, so I looked at the generated assembly. After fixing a silly pointer-related mistake, I ended up with the following instruction sequences, each run a billion times in a loop (the processor is an AMD Phenom II at 3.7 GHz):
SSE enabled: 1.1 to 1.8 seconds (varies)
add $0x1, %eax
addpd %xmm0, %xmm1
cmp $0x3b9aca00, %eax
jne 4006c8
SSE disabled: 1.0 seconds (pretty constant)
add $0x1, %eax
addsd %xmm0, %xmm3
cmp $0x3b9aca00, %eax
addsd %xmm2, %xmm1
jne 400630
The only conclusion I can draw from this is that addsd is faster than addpd, and that pipelining compensates for the extra instruction because the two independent scalar additions can partially overlap.
So my question is: is this worth it, and will it actually help in practice, or should I not bother with this optimization and let the compiler handle it in scalar mode?
This requires more loop unrolling and maybe cache prefetching. Your arithmetic density is very low: 1 operation for 2 memory operations, so you need to jam as many of these into your pipeline as possible.
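As a rough illustration of the unrolling idea, here is a minimal sketch that sums an array of 2D vectors using two independent accumulators, so that consecutive additions do not all wait on the same register. The function name sum_vec2 and the interleaved x0, y0, x1, y1, ... layout are assumptions made for this example, not something taken from the question, and _mm_loadu_pd is used so the input does not need 16-byte alignment.

```cpp
#include <cstddef>
#include <immintrin.h>

// Hypothetical helper: sums n (x, y) pairs stored contiguously as
// x0, y0, x1, y1, ... and writes the totals to out[0] (x) and out[1] (y).
void sum_vec2(const double* data, std::size_t n, double out[2]) {
    // Two accumulators form two independent dependency chains.
    __m128d acc0 = _mm_setzero_pd();
    __m128d acc1 = _mm_setzero_pd();

    std::size_t i = 0;
    // Unrolled by two: each iteration consumes two (x, y) pairs.
    for (; i + 2 <= n; i += 2) {
        acc0 = _mm_add_pd(acc0, _mm_loadu_pd(data + 2 * i));
        acc1 = _mm_add_pd(acc1, _mm_loadu_pd(data + 2 * i + 2));
    }
    // Handle the leftover pair when n is odd.
    if (i < n) {
        acc0 = _mm_add_pd(acc0, _mm_loadu_pd(data + 2 * i));
    }
    // Combine the two partial sums and store x and y.
    _mm_storeu_pd(out, _mm_add_pd(acc0, acc1));
}
```

Whether this beats what the compiler produces for plain scalar code depends on the data size and the CPU, so it is worth timing both versions on the real workload.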
Also, don't use a union; use __m128d directly, and use _mm_load_pd to fill your __m128d from your data. Putting an __m128d in a union generates bad code in which every element does a stack-register-stack dance, which is detrimental.
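A minimal sketch of what a union-free wrapper could look like is shown below. The struct name Vec2 and its members are hypothetical and chosen only for illustration. It uses the unaligned _mm_loadu_pd rather than _mm_load_pd so that it works with any double pointer; if your data is guaranteed to be 16-byte aligned, _mm_load_pd can be substituted.

```cpp
#include <immintrin.h>

// Hypothetical 2D vector type that stores the __m128d directly
// instead of overlaying it with doubles in a union.
struct Vec2 {
    __m128d v;  // lane 0 holds x, lane 1 holds y

    // _mm_set_pd takes the high lane first, so y comes before x.
    Vec2(double x, double y) : v(_mm_set_pd(y, x)) {}
    explicit Vec2(__m128d r) : v(r) {}

    // Load x and y from two consecutive doubles (no alignment required).
    static Vec2 load(const double* p) { return Vec2(_mm_loadu_pd(p)); }

    Vec2& operator+=(const Vec2& o) {
        v = _mm_add_pd(v, o.v);
        return *this;
    }

    // Extract the low lane.
    double x() const { return _mm_cvtsd_f64(v); }
    // Move the high lane down to the low lane, then extract it.
    double y() const { return _mm_cvtsd_f64(_mm_unpackhi_pd(v, v)); }
};
```

The accessors are meant for occasional reads; keeping values in __m128d form across a sequence of operations is what avoids the round trips through memory described above.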