New posts in avx2

Accumulating Doubles Into Bins via intrinsics

c++ simd avx avx2

AVX2: Is there a way to implement _mm256_mul_epi8 function for a constant power of 2?

c++ simd intrinsics avx avx2

SIMD unpack 12-bit fields to 16-bit

Why is masking needed before using a pshufb shuffle as a lookup table for nibbles?

c++ simd sse avx avx2

AVX2 integer comparison for less than or equal

c integer compare avx avx2

Find Absolute in AVX

When is it correct to cast to __m256 instead of loading?

c++ casting simd avx2

Why does _mm256_unpacklo "jump" a double-word and where does it say so in the documentation?

c++ simd intrinsics avx2

Bitwise NOT/complement in AVX2 [duplicate]

Is there a fast way to convert a string of 8 ASCII decimal digits into a binary number?

c++ parsing simd avx2 atoi

Is vfmadd132pd slow on AMD Zen 3 architecture?

Auto-vectorize shuffle instruction

c sse avx2 auto-vectorization

Why is this index-of-max function over many arrays of 256 bytes so slow on Intel i3-N305 compared to AMD Ryzen 7 3800X?

How to copy from an array to a Vector256 and vice versa based on the array index?

c# .net simd avx2

Intel FMA Instructions Offer Zero Performance Advantage

c assembly avx2 fma

Testing whether AVX register contains some equal integer numbers

c++ x86 simd avx avx2

AVX2 Transpose of a matrix represented by 8x __m256i registers

c x86 transpose simd avx2

How to swap 128-bit parts between two AVX2 vectors

c# c++ .net avx2

Transform random integers into range [min,max] without branching

c++ bit-manipulation simd avx2

How to turn on -mavx2 for only particular part of source code?

c++ gcc clang intrinsics avx2