New posts in avx2

Why do processors with only AVX out-perform AVX2 processors for many SIMD algorithms?

Sep 17, 2019

c# c++ simd avx avx2

Does /arch:AVX enable AVX2?

May 08, 2019

c++ visual-c++ visual-studio-2012 vectorization avx2

Best way to load/store from/to general purpose registers to/from xmm/ymm register

Nov 03, 2022

assembly x86 simd sse2 avx2

Fully utilizing pipelines on Kaby Lake

Nov 16, 2022

performance assembly x86-64 micro-optimization avx2

How to concatenate two vectors efficiently using AVX2? (a lane-crossing version of VPALIGNR)

Mar 08, 2022

c simd intrinsics avx avx2

Counting 1 bits (population count) on large data using AVX-512 or AVX-2

Mar 19, 2022

assembly avx2 avx512 bitcount population-count

Shifting SSE/AVX registers 32 bits left and right while shifting in zeros

Nov 27, 2018

x86 sse simd avx avx2

Efficient way of rotating a byte inside an AVX register

Mar 06, 2022

c sse simd avx avx2

Count leading zero bits for each element in AVX2 vector, emulate _mm256_lzcnt_epi32

Mar 17, 2022

bit-manipulation simd avx avx2 avx512

Optimal SIMD algorithm to rotate or transpose an array

Nov 26, 2020

assembly intel simd transpose avx2

Fast modulo-12 algorithm for 4 uint16_t's packed in a uint64_t

May 15, 2022

c algorithm vectorization modulo avx2

What do you do without fast gather and scatter in AVX2 instructions?

Sep 05, 2022

algorithm performance optimization simd avx2

How to implement an efficient _mm256_madd_epi8?

Nov 09, 2021

c++ x86 simd intrinsics avx2

Efficient implementation of log2(__m256d) in AVX2

Sep 16, 2022

c++ algorithm floating-point logarithm avx2

Parallel programming using Haswell architecture [closed]

Apr 12, 2015

sse cpu-architecture avx avx2

How can I add together two SSE registers?

Oct 04, 2022

c++ c intel sse avx2

Efficient way to set first N or last N bits of __m256i to 1, the rest to 0

Oct 31, 2020

c++ bit-manipulation vectorization x86-64 avx2
