New posts in avx

Can I generate AVX vectorized code using LLVM jit?

x86 llvm jit avx

find nan in array of doubles using simd

c nan sse simd avx

How to store the lower or higher values from an AVX/AVX2 (YMM) register to memory, as SSE movlps/movhps does?

x86 sse simd avx avx2

Small branches in modern CPUs

SIMD minmag and maxmag

The indices of non-zero bytes of an SSE/AVX register

c++ c sse simd avx

perf report shows that the function "__memset_avx2_unaligned_erms" has overhead. Does this mean memory is unaligned?

c++ profiling avx perf avx2

Can AVX2 be used to implement faster LZCNT processing on a word array?

How to make premultiplied alpha function faster using SIMD instructions?

c++ x86 sse simd avx

128-bit SSE counter?

AVX2, How to Efficiently Load Four Integers to Even Indices of a 256 Bit Register and Copy to Odd Indices?

x86 sse simd avx avx2

Using AVX instructions disables exp() optimization?

visual-c++ x86 exp avx

Assembly code/AVX instructions for multiplication of complex numbers. (GCC inline assembly)

What is the difference between MOVDQA and MOVNTDQA, and VMOVDQA and VMOVNTDQ for WB/WC marked region?

assembly x86 sse simd avx

AVX2 VPSHUFB emulation in AVX

x86 simd intrinsics avx

_mm_alignr_epi8 (PALIGNR) equivalent in AVX2

x86 simd intrinsics avx avx2

Setting __m256i to the value of two __m128i values

c sse simd avx

Loading 8 chars from memory into an __m256 variable as packed single precision floats

c++ sse simd avx avx2

Unknown type name __m256 - Intel intrinsics for AVX not recognized?

c++ c intel intrinsics avx

Shuffling by mask with Intel AVX

c++ sse simd intrinsics avx