
New posts in sse

Meaning of XMM register values shown in Visual Studio debugger's register window

How to convert int 64 to int 32 with avx (but without avx-512)

simd sse avx

Why does __m128 cause alignment issues in a union with float x/y/z?

Out-of-range floating point to integer conversion breaks in VS2022 executable when linking VS2017 or VS2019 libraries

optimising column-wise maximum with SIMD

c++ sse simd intrinsics avx

Should you pass __m128 (and other register types) by reference or by copy?

c++ simd sse intrinsics

average operation ARM NEON

arm sse simd neon intrinsics

How to compile a project which requires SSE2 on MacBook with M1 chip?

Why is SIMD slower than scalar counterpart

assembly x86 sse simd

CVTTSD2SI - a truncating instruction - uses rounding with "inexact" results?

How to store 4 32 bit floats into one 128 bit xmm register?

assembly x86 x86-64 sse simd

gcc vector extensions don't work as stated in docs

gcc sse vectorization

How to move (up to) 16 single bytes into an XMM register?

assembly x86 intel sse simd

No insert and extract for float/double in SSE and AVX?

c++ floating-point sse simd avx

Auto-vectorize shuffle instruction

c sse avx2 auto-vectorization

Why won't simple code get auto-vectorized with SSE and AVX in modern compilers?

Reading SSE registers (XMM, YMM) in a signal handler

Why do x86 FP compares set CF like unsigned integers, instead of using signed conditions?

assembly x86 sse sse2 x87

Vectorization of modulo multiplication

c++ algorithm sse simd avx

Does RSQRTSS break the dependency on the destination register?