I do some explicitly vectorised computations using SSE types, such as __m128 (defined in xmmintrin.h etc.), but now I need to raise all elements of the vector to some (same) power, i.e. ideally I would want something like __m128 _mm_pow_ps(__m128, float), which unfortunately doesn't exist.
What is the best way around this? I could store the vector, call std::pow on each element, and then reload it. Is this the best I can do? How do compilers implement a call to std::pow when auto-vectorising code that otherwise is well vectorisable? Are there any libraries that provide something useful?
(Note that this question is related but not a duplicate, and certainly doesn't have a useful answer.)
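For reference, the store / std::pow / reload fallback described in the question could look roughly like the sketch below. The helper name pow_ps_scalar_fallback is made up for illustration and is not part of any SSE header; the code only assumes an x86 target where xmmintrin.h is available.

```cpp
#include <cmath>
#include <xmmintrin.h>

// Hypothetical helper (not a real intrinsic): raises each of the four
// lanes of x to the power y by going through scalar std::pow.
inline __m128 pow_ps_scalar_fallback(__m128 x, float y) {
    alignas(16) float lanes[4];
    _mm_store_ps(lanes, x);      // spill the four lanes to aligned memory
    for (float &v : lanes) {
        v = std::pow(v, y);      // scalar pow on each lane
    }
    return _mm_load_ps(lanes);   // reload the results as a vector
}
```

This is slow compared to a truly vectorised routine, but it gives the same results as std::pow and so makes a convenient baseline for checking accuracy and measuring speed-ups.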
Use the formula exp(y*log(x)) for pow(x, y) and a library with SSE implementations of exp() and log().
Edit by @Royi:
The above holds only when both x and y are positive. Otherwise more careful math is needed. See https://math.stackexchange.com/questions/2089690.
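A minimal sketch of how the exp(y*log(x)) identity composes on __m128 values is shown here. The names log_ps_standin, exp_ps_standin and pow_ps_sketch are hypothetical: the two stand-ins just call std::log and std::exp lane by lane so that the example compiles on its own, and in real code they would be replaced by the vectorised exp() and log() of whichever SSE math library you use.

```cpp
#include <cmath>
#include <xmmintrin.h>

// Stand-in for a vectorised log(); assumption: a real SSE math library
// routine would replace this lane-by-lane loop.
inline __m128 log_ps_standin(__m128 x) {
    alignas(16) float lanes[4];
    _mm_store_ps(lanes, x);
    for (float &v : lanes) v = std::log(v);
    return _mm_load_ps(lanes);
}

// Stand-in for a vectorised exp(); same assumption as above.
inline __m128 exp_ps_standin(__m128 x) {
    alignas(16) float lanes[4];
    _mm_store_ps(lanes, x);
    for (float &v : lanes) v = std::exp(v);
    return _mm_load_ps(lanes);
}

// pow(x, y) computed as exp(y * log(x)), with y broadcast to all lanes.
inline __m128 pow_ps_sketch(__m128 x, float y) {
    const __m128 vy = _mm_set1_ps(y);                 // broadcast exponent
    const __m128 ylogx = _mm_mul_ps(vy, log_ps_standin(x));
    return exp_ps_standin(ylogx);
}
```

With these stand-ins, a lane holding a negative x comes out as NaN, because the log of a negative number is NaN. The result can also differ from std::pow in the last few bits, since rounding errors in log() are amplified by the multiplication and the exp().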
I really recommend the Intel Short Vector Math Library for these types of operations. The library is bundled with the Intel compiler, which you mention in the list of compilers to support. I doubt it would be useful for gcc and clang, but it could serve as a reference point for benchmarking whatever pow implementation you come up with.
https://software.intel.com/sites/products/documentation/doclib/iss/2013/compiler/cpp-lin/GUID-DEB8B19C-E7A2-432A-85E4-D5648250188E.htm
An AVX version of the ssemath library is now available: http://software-lisc.fbk.eu/avx_mathfun/
With the library you can use:
exp256_ps(y*log256_ps(x)); // for pow(x, y)