New posts in simd

SIMD vs Vector architectures

Fastest way to unpack 32 bits to a 32 byte SIMD vector

x86 simd avx bitmask avx2

Do all CPUs which support AVX2 also support SSE4.2 and AVX?

sse simd avx avx2

Storing two x86 32 bit registers into 128 bit xmm register

assembly x86 simd sse

What are the 128-bit to 512-bit registers used for?

Most efficient way to check if all __m128i components are 0 [using <= SSE4.1 intrinsics]

c++ integer sse simd intrinsics

AVX2 slower than SSE on Haswell

c++ x86 sse simd avx2

How to convert a binary integer number to a hex string?

assembly x86 hex simd avx512

how to work with 128 bits C variable and xmm 128 bits asm?

c sse simd

SSE micro-optimization instruction order

approximating log10[x^k0 + k1]

Vectorize a function in clang

c++ vector simd clang++

gcc, simd intrinsics and fast-math concepts

gcc simd intrinsics fast-math

Packing and de-interleaving two __m256 registers

c++ x86 simd avx avx2

Why both? vperm2f128 (avx) vs vperm2i128 (avx2)

intel simd avx avx2

Is there a good double-precision small matrix SIMD library for x86?

Most efficient way to store 4 dot products into a contiguous array in C using SSE intrinsics

Fast counting the number of equal bytes between two arrays [duplicate]

c++ c sse simd sse2

Is it possible to use SIMD instructions in Rust?

rust simd avx avx2

Is it possible to vectorize myNum += a[b[i]] * c[i]; on x86_64?