New posts in simd
AVX2, How to Efficiently Load Four Integers to Even Indices of a 256 Bit Register and Copy to Odd Indices?
Oct 07, 2018 | Tags: x86, sse, simd, avx, avx2

Why are SIMD instructions not used in kernel?
Sep 10, 2022 | Tags: linux-kernel, operating-system, linux-device-driver, simd, ispc

How to convert 32-bit float to 8-bit signed char? (4:1 packing of int32 to int8 __m256i)
Jan 24, 2022 | Tags: c, x86, simd, intrinsics, avx2

Summing 3 lanes in a NEON float32x4_t
Dec 01, 2020 | Tags: ios, arm, simd, neon, intrinsics

What is the difference between MOVDQA and MOVNTDQA, and VMOVDQA and VMOVNTDQ for WB/WC marked region?
Jun 09, 2020 | Tags: assembly, x86, sse, simd, avx

AVX2 VPSHUFB emulation in AVX
Oct 01, 2018 | Tags: x86, simd, intrinsics, avx

_mm_alignr_epi8 (PALIGNR) equivalent in AVX2
Sep 01, 2020 | Tags: x86, simd, intrinsics, avx, avx2

How do you move 128-bit values between XMM registers?
Feb 17, 2020 | Tags: assembly, simd, sse

Setting __m256i to the value of two __m128i values
Mar 30, 2019 | Tags: c, sse, simd, avx

Loading 8 chars from memory into an __m256 variable as packed single precision floats
Jun 17, 2021 | Tags: c++, sse, simd, avx, avx2

Shuffling by mask with Intel AVX
Mar 08, 2022 | Tags: c++, sse, simd, intrinsics, avx

Control flow divergence in SIMT and SIMD
May 11, 2022 | Tags: cuda, sse, simd

Are there SIMD(SSE / AVX) instructions in the x86-compatible accelerators Intel Xeon Phi?
Nov 02, 2022 | Tags: intel, sse, simd, avx, intel-mic

Faster lookup tables using AVX2
May 07, 2022 | Tags: algorithm, performance, optimization, sse, simd

Does using mix of pxor and xorps affect performance?
Aug 26, 2021 | Tags: assembly, x86, sse, simd

Is there an efficient way to get the first non-zero element in an SIMD register using SIMD intrinsics?
Oct 23, 2022 | Tags: x86, bit-manipulation, simd, intrinsics, avx

Using a variable to index a simd vector with _mm256_extract_epi32() intrinsic
Feb 26, 2022 | Tags: simd, intrinsics, avx, avx2

Is casting to simd-type undefined behaviour in C++? [duplicate]
May 13, 2022 | Tags: c++, sse, undefined-behavior, simd, intrinsics

What's the most efficient way to load and extract 32 bit integer values from a 128 bit SSE vector?
Nov 29, 2019 | Tags: c, gcc, sse, simd

ARM and NEON can work in parallel?
Oct 30, 2018 | Tags: arm, inline-assembly, simd, neon, cortex-a8