New posts in avx

Non-temporal stores of portions of a packed double vector using SSE/AVX

caching x86 x86-64 sse avx

What is the minimum version of OS X for use with AVX/AVX2?

macos sse avx avx2

How to set all elements in a __m256d to, say, the 3rd element of another __m256d?

sse avx

gdb printing a __m256i as 8x 32-bit elements instead of the default 4x 64-bit?

integer gdb intrinsics avx

x86 CPU Dispatching for SSE/AVX in C++

x86 sse simd avx

AVX512: Best way to combine horizontal sum and broadcast

c intel avx avx512

Rotating (by 90°) a bit matrix (up to 8x8 bits) within a 64-bit integer

How to find the index of an element in the AVX vector?

x86 intrinsics avx

Why do bit manipulation intrinsics like _bextr_u64 often perform worse than simple shift and mask operations?

How to sum all 32-bit or 64-bit sub-registers in an SSE XMM, or AVX YMM, and ZMM register?

sse simd avx

Using sse and avx intrinsics to add a set of packed singles into one value

c++ c++11 sse avx

Optimal uint8_t bitmap into a 8 x 32bit SIMD "bool" vector

c++11 simd avx avx2

Websocket data unmasking / multi byte xor

c x86 sse simd avx

Does VS2010 SP1 support only part of the AVX instruction set?

Difference between _mm256_xor_si256() and _mm256_xor_ps()

intrinsics avx avx2

C++ AVX2 Intrinsic function Non-Standard Size

c++ simd intrinsics avx avx2

Different semantics of comparison intrinsic instructions in AVX512?

c++ sse intrinsics avx avx512

Integer dot product using SSE/AVX?

c++ vectorization sse simd avx