
New posts in intrinsics

SSE2 code optimization

c++ sse simd intrinsics sse2

How to square two complex doubles with 256-bit AVX vectors?

SSE rounds down when it should round up

How do non-temporal instructions work?

Way to effectively call _BitScanReverse or __builtin_clz in constexpr functions?

How to instruct compiler to generate unaligned loads for __m128

c++ x86-64 sse simd intrinsics

Simple C++ expression templates wrapping intrinsics produces different instructions

c++ intrinsics

How to implement an efficient _mm256_madd_epi8?

c++ x86 simd intrinsics avx2

How can I get an intrinsic for the exp() function in x64 code?

Computing 8 horizontal sums of eight AVX single-precision floating-point vectors

Efficiently gather individual bytes, separated by a byte-stride of 4

c intrinsics avx

Testing for builtins/intrinsics

c gcc intrinsics

_addcarry_u64 and _addcarryx_u64 with MSVC and ICC

Compile C++ code with AVX2/AVX512 intrinsics on AVX

Truth-table reduction to ternary logic operations, vpternlog

Check XMM register for all zeroes

c++ sse simd intrinsics

AVX log intrinsics (_mm256_log_ps) missing in g++-4.8?

c++ g++ intrinsics avx

Why do Java intrinsic functions still have code?

java intrinsics

Does Clang have something like #pragma GCC target?

clang intrinsics avx pragma

Most efficient way to check if all __m128i components are 0 [using <= SSE4.1 intrinsics]

c++ integer sse simd intrinsics