New posts in avx2

Convert signed short to float in C++ SIMD

c++ sse simd avx2

Fastest method to calculate sum of all packed 32-bit integers using AVX512 or AVX2

c intrinsics avx avx2 avx512

Is it really efficient to use Karatsuba algorithm in 64-bit x 64-bit multiplication?

Which is the reason for avx floating point bitwise logical operations?

c++ simd avx avx2

gdb reverse debugging avx2

c gdb glibc avx2

uint32_t * uint32_t = uint64_t vector multiplication with gcc

c gcc vectorization avx2 gcc9

Getting GCC to generate a PTEST instruction when using vector extensions

c gcc vectorization sse avx2

How to do _mm256_maskstore_epi8() in C/C++?

c++ simd intrinsics avx avx2

AVX2 byte gather with uint16 indices, into a __m256i

c intrinsics avx pack avx2

Efficient (on Ryzen) way to extract the odd elements of a __m256 into a __m128?

What is the floating-point (__m256d) version of the non-temporal streaming load intrinsic (_mm256_stream_load_si256)?

c++ x86 simd intrinsics avx2

Find the first instance of a character using simd

x86 sse simd avx avx2

AVX2 instructions latency and throughput

performance x86 x86-64 simd avx2

Intel IACA analyzer alters assembly?

assembly simd avx2 iaca

How to verify that the operating system supports AVX2 instructions

AVX2 sparse matrix multiplication

_mm256_slli_si256: error "last argument must be an 8-bit immediate"

c gcc simd avx avx2

Why doesn't Intel design its SIMD ISAs in a more compatible or universal way?

intel simd avx avx2 avx512