
AVX 256-bit equivalent for _mm_load1_ps

With SSE you can load a single float from memory into all 4 slots of a __m128 with the intrinsic _mm_load1_ps().

When using 256-bit-wide SIMD with AVX, there seems to be no _mm256_load1_ps() to load a single float from memory into all 8 slots of the vector.

Why was this omitted, and what's the best way around it?

Or even better: is there a way to load a single float to a targeted slot 0..7 of the vector?

Bram asked Jun 13 '13 23:06


1 Answer

_mm256_broadcast_ss is what you are looking for. It takes a pointer to a single float and copies that value into all 8 slots of a __m256.
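A minimal sketch of how this might look in practice. The helper names broadcast8 and set_lane3, and the AVX_TARGET macro, are made up for illustration and are not part of any library. The macro is only there so the snippet builds on GCC/Clang without -mavx; with -mavx (or /arch:AVX on MSVC) it can be dropped. For the "targeted slot" part of the question, one possible approach, assuming the slot index is known at compile time, is to broadcast the float and then blend it into an existing vector with _mm256_blend_ps:

```c
#include <immintrin.h>

/* Hypothetical convenience macro: enables AVX per function on GCC/Clang. */
#if defined(__GNUC__) || defined(__clang__)
#define AVX_TARGET __attribute__((target("avx")))
#else
#define AVX_TARGET
#endif

/* Load *src into all 8 slots and write the vector to out[0..7]. */
AVX_TARGET
static void broadcast8(const float *src, float out[8])
{
    __m256 v = _mm256_broadcast_ss(src);
    _mm256_storeu_ps(out, v);
}

/* Copy in[0..7] to out[0..7], but with slot 3 replaced by *src.
   The blend mask must be a compile-time constant: bit i set means
   "take slot i from the second operand". */
AVX_TARGET
static void set_lane3(const float *src, const float in[8], float out[8])
{
    __m256 v = _mm256_loadu_ps(in);
    __m256 b = _mm256_broadcast_ss(src);
    v = _mm256_blend_ps(v, b, 1 << 3);
    _mm256_storeu_ps(out, v);
}
```

If the value is already in a float variable rather than behind a pointer, _mm256_set1_ps(x) fills all 8 slots in the same way.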

Marat Dukhan answered Oct 24 '22 17:10