I'm looking for a SIMD library focused on small (4x4) matrix operations for graphics. There are lots of single-precision ones out there, but I need to support both single and double precision.
I've looked at Intel's IPP MX library, but I'd prefer something with source. I'm very interested in SSE3+ implementations of these particular operations:
EDIT: No "premature optimization" answers please. Anyone who has worked with small matrices knows GCC does not vectorize these as well as hand-optimized intrinsics or ASM. And in this case it's important, or I wouldn't be asking.
Maybe the Eigen library?
It supports the SSE 2/3/4, ARM NEON and AltiVec instruction sets.
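To give an idea of the kind of kernel being asked about, here is a rough sketch of a double-precision 4x4 multiply written with SSE2 intrinsics. It is not taken from Eigen or any other library: the function name `mat4d_mul` and the column-major layout are my own assumptions for illustration, and it only uses SSE2 (no SSE3+ instructions), with a scalar fallback when SSE2 is not available.

```cpp
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2: __m128d holds two doubles
#endif

// Hypothetical helper (not part of any library): computes C = A * B for
// 4x4 double matrices stored column-major as 16 contiguous doubles,
// i.e. element (row, col) lives at index 4 * col + row.
// The output c must not alias a or b.
void mat4d_mul(const double* a, const double* b, double* c) {
#if defined(__SSE2__)
    for (int j = 0; j < 4; ++j) {
        // Column j of C is accumulated in two registers:
        // lo holds rows 0-1, hi holds rows 2-3.
        __m128d lo = _mm_setzero_pd();
        __m128d hi = _mm_setzero_pd();
        for (int k = 0; k < 4; ++k) {
            // Broadcast B(k, j) and scale column k of A by it.
            __m128d s = _mm_set1_pd(b[4 * j + k]);
            lo = _mm_add_pd(lo, _mm_mul_pd(_mm_loadu_pd(a + 4 * k), s));
            hi = _mm_add_pd(hi, _mm_mul_pd(_mm_loadu_pd(a + 4 * k + 2), s));
        }
        _mm_storeu_pd(c + 4 * j, lo);
        _mm_storeu_pd(c + 4 * j + 2, hi);
    }
#else
    // Portable scalar fallback with the same layout and semantics.
    for (int j = 0; j < 4; ++j) {
        for (int r = 0; r < 4; ++r) {
            double sum = 0.0;
            for (int k = 0; k < 4; ++k) {
                sum += a[4 * k + r] * b[4 * j + k];
            }
            c[4 * j + r] = sum;
        }
    }
#endif
}
```

The unaligned loads and stores keep the sketch safe for arbitrary `double` arrays; with 16-byte-aligned storage the aligned variants (`_mm_load_pd`, `_mm_store_pd`) could be used instead. A single-precision version would follow the same pattern with `__m128` and one register per column.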