New posts in avx2

Fully utilizing pipelines on Kaby Lake

Nov 16, 2022

performance assembly x86-64 micro-optimization avx2

How to concatenate two vectors efficiently using AVX2? (a lane-crossing version of VPALIGNR)

Mar 08, 2022

c simd intrinsics avx avx2

Counting 1 bits (population count) on large data using AVX-512 or AVX2

Mar 19, 2022

assembly avx2 avx512 bitcount population-count

Shifting SSE/AVX registers 32 bits left and right while shifting in zeros

Nov 27, 2018

x86 sse simd avx avx2

Efficient way of rotating a byte inside an AVX register

Mar 06, 2022

c sse simd avx avx2

Count leading zero bits for each element in AVX2 vector, emulate _mm256_lzcnt_epi32

Mar 17, 2022

bit-manipulation simd avx avx2 avx512

Optimal SIMD algorithm to rotate or transpose an array

Nov 26, 2020

assembly intel simd transpose avx2

Fast modulo-12 algorithm for 4 uint16_t's packed in a uint64_t

May 15, 2022

c algorithm vectorization modulo avx2

What do you do without fast gather and scatter in AVX2 instructions?

Sep 05, 2022

algorithm performance optimization simd avx2

How to implement an efficient _mm256_madd_epi8?

Nov 09, 2021

c++ x86 simd intrinsics avx2

Efficient implementation of log2(__m256d) in AVX2

Sep 16, 2022

c++ algorithm floating-point logarithm avx2

Parallel programming using Haswell architecture [closed]

Apr 12, 2015

sse cpu-architecture avx avx2

How can I add together two SSE registers?

Oct 04, 2022

c++ c intel sse avx2

Efficient way to set first N or last N bits of __m256i to 1, the rest to 0

Oct 31, 2020

c++ bit-manipulation vectorization x86-64 avx2

Fastest way to unpack 32 bits to a 32 byte SIMD vector

Jan 01, 2017

x86 simd avx bitmask avx2

Do all CPUs which support AVX2 also support SSE4.2 and AVX?

Nov 13, 2022

sse simd avx avx2

AVX2 slower than SSE on Haswell

May 18, 2017

c++ x86 sse simd avx2

Is this incorrect code generation with arrays of __m256 values a clang bug?

Mar 12, 2019

c++ clang compiler-optimization avx2
