
New posts in sse

Compute the absolute difference between unsigned integers using SSE

c++ unsigned sse

Horizontal minimum and maximum using SSE

c++ max sse minimum avx

Using SIMD on amd64, when is it better to use more instructions vs. loading from memory?

How do you populate an x86 XMM register with 4 identical floats from another XMM register entry?

c++ c x86 inline-assembly sse

How to allocate 16byte memory aligned data

c memory sse icc

What is the fastest way to test if a double number is integer (in modern intel X86 processors)

c optimization assembly x86 sse

Fast counting the number of set bits in __m128i register

c sse simd sse2 hammingweight

Using SSE instructions with gcc without inline assembly

c x86-64 sse simd intrinsics

Can CUDA use SIMD extensions?

cuda gpu sse simd vectorization

Intel SSE: Why does `_mm_extract_ps` return `int` instead of `float`?

c sse simd

How to negate (change sign) of the floating point elements in a __m128 type variable?

c x86 vectorization sse simd

How to divide a 16-bit integer by 255 using SSE?

SSE multiplication 16 x uint8_t

x86 sse simd sse4

Why does vectorization behave differently for almost the same code?

Computing Hamming distances to several strings with SSE

c gcc sse simd hamming-distance

SSE register return with SSE disabled

c gcc floating-point sse

Looking for sse 128 bit shift operation for non-immediate shift value

c++ c sse

Which versions of Windows support/require which CPU multimedia extensions? (How to check if SSE or AVX are fully usable?)

windows assembly sse avx avx512

Why are there 128bit load functions for SSE?

c++ x86 sse simd intrinsics

Look-Up Table using SIMD

c++ sse simd