New posts in simd
Initialize __m256i from 64 high or low bits of four __m128i variables
May 20, 2026
c++
sse
simd
avx
avx2
Intel AVX inconsistent _mm256_load_si256 integer operation in C
May 20, 2026
c
x86
simd
intrinsics
avx
Realistic deadlock example in CUDA/OpenCL
May 19, 2026
synchronization
cuda
parallel-processing
opencl
simd
cuda SIMD instruction for per-byte multiplication with unsigned saturation
May 16, 2026
c
cuda
multiplication
simd
saturation-arithmetic
What is the difference between AVX2 and AVX-512?
May 13, 2026
opencl
simd
avx
avx2
avx512
SSE4.1 slower than SSE3 on 4x4 matrix multiplication?
May 10, 2026
c++
matrix
simd
sse
matmul
Twice as slow SIMD performance without extra copy
May 02, 2026
assembly
x86-64
simd
sse
amd-processor
SSE - Nonexistent haddsub intrinsic?
May 02, 2026
sse
simd
intrinsics
AVX(2)/SIMD way to get/set (to 1) a single bit in a 256 bit register
Apr 30, 2026
c++
bit-manipulation
simd
avx
avx2
quaternion multiplication with gcc vector extensions
May 01, 2026
c++
gcc
simd
quaternions
SSE: How to reduce a _m128i._i32[4] to _m128i._i8
Apr 30, 2026
c++
x86
sse
simd
How do the AVX(2) gather instructions actually compute the fetch address?
Apr 28, 2026
c++
simd
intrinsics
avx
avx2
SSE optimisation for a loop that finds zeros in an array and toggles a flag + updates another array
Apr 28, 2026
c++
optimization
x86
sse
simd
aarch64 xtn2 clearing lower half
Apr 26, 2026
assembly
simd
arm64
neon
armv8
Neon casting issue
Apr 27, 2026
arm
simd
neon
int32
uint8t
Square root of a OpenCV's grey image using SSE
Apr 28, 2026
c++
opencv
sse
simd
How do I take the average of a large floating point array precisely?
Apr 25, 2026
assembly
floating-point
precision
simd
avx