
New posts in intrinsics

SSE intrinsics - comparison if/else optimization

c++ sse intrinsics

How to use x86intrin.h

c gcc x86-64 intrinsics bmi

How to stop GCC from breaking my NEON intrinsics?

c gcc arm neon intrinsics

AVX 256-bit equivalent for _mm_load1_ps

simd intrinsics avx

How compilers treat SSE (or any) intrinsic functions?

Sorting 64-bit structs using AVX?

c++ intrinsics avx

SSE2 code optimization

c++ sse simd intrinsics sse2

How to square two complex doubles with 256-bit AVX vectors?

SSE rounds down when it should round up

How do non temporal instructions work?

Way to effectively call _BitScanReverse or __builtin_clz in constexpr functions?

How to instruct compiler to generate unaligned loads for __m128

c++ x86-64 sse simd intrinsics

Simple C++ expression templates wrapping intrinsics produces different instructions

c++ intrinsics

How to implement an efficient _mm256_madd_epi8?

c++ x86 simd intrinsics avx2

How can I get an intrinsic for the exp() function in x64 code?

Computing 8 horizontal sums of eight AVX single-precision floating-point vectors

Efficiently gather individual bytes, separated by a byte-stride of 4

c intrinsics avx

Testing for builtins/intrinsics

c gcc intrinsics

_addcarry_u64 and _addcarryx_u64 with MSVC and ICC

Compile C++ code with AVX2/AVX512 intrinsics on AVX