New posts in sse

Dot product performance with SSE instructions: is DPPS worth using?

Can I use SIMD intrinsics for software that runs on cloud?

x86 cloud sse simd

X86: How to set lower half of xmm0 to 0, without affecting the upper half?

In JWASM/MASM - pshufw produces Error A2030: Instruction or register not accepted in current CPU mode

assembly x86 masm sse mmx

AVX2: U8 absolute difference

sse simd neon avx avx2

How can I do efficiently bitwise majority voting on 3, 5, 7, 9 inputs with SSE/SSE2/AVX/...?

assembly sse avx neon avx512

Convention for displaying vector registers

x86 sse simd avx

gcc (6.1.0) using 'wrong' instructions in SSE intrinsics

c gcc sse intrinsics

Loading an xmm from GP regs

SIMD: Bit-pack signed integers

sse simd avx avx2 avx512

Count number of matching bytes between two __m128i SIMD vectors

Emscripten: How can I compile a C file with an intrinsics header like immintrin.h?

How to get the number of unique elements of a simd vector in C

c simd sse

First use of AVX 256-bit vectors slows down 128-bit vector and AVX scalar ops

assembly x86-64 sse simd avx

Aligning memory on 16-byte and 32-byte boundaries

memory alignment sse simd avx

Why is masking needed before using a pshufb shuffle as a lookup table for nibbles?

c++ simd sse avx avx2

Performance of SSE and AVX when both are memory-bandwidth limited

performance caching sse avx

Set an XMM register to a repeating byte pattern (broadcast a constant byte)

How to best emulate the logical meaning of _mm_slli_si128 (128-bit bit-shift), not _mm_bslli_si128

c sse simd intrinsics sse2

Aliasing of NEON vector data types

c++ c sse simd neon