New posts in simd

How to write portable SIMD code for complex multiplicative reduction

c++ c gcc simd avx

How to implement atoi using SIMD?

c++ x86 sse simd atoi

Intel AVX: 256-bits version of dot product for double precision floating point variables

c++ performance simd avx

What are the best instruction sequences to generate vector constants on the fly?

assembly x86 sse simd avx

Crash with icc: can the compiler invent writes where none existed in the abstract machine?

How to check if compiled code uses SSE and AVX instructions?

c++ assembly x86 g++ simd

Why is ARM NEON not faster than plain C++?

c++ arm simd neon cortex-a8

What's missing/sub-optimal in this memcpy implementation?

c optimization x86 simd avx

CPU SIMD vs GPU SIMD?

Why vectorizing the loop does not have performance improvement

c performance simd icc

Difference between MOVDQA and MOVAPS x86 instructions?

assembly x86 sse simd mov intel

Why is strcmp not SIMD optimized?

c++ sse simd strcmp sse2

AVX2 what is the most efficient way to pack left based on a mask?

c++ vectorization sse simd avx2

ARM Cortex-A8: What's the difference between VFP and NEON?

arm simd neon cortex-a8

How to determine if memory is aligned?

c optimization memory sse simd

Getting started with Intel x86 SSE SIMD instructions

c gcc x86 sse simd

SSE intrinsic functions reference

c++ c gcc sse simd

How to choose AVX compare predicate variants

simd avx

Parallel for vs omp simd: when to use each?

c++ c performance openmp simd

Fastest way to do horizontal SSE vector sum (or other reduction)