
New posts in sse

SSE: How to reduce a _m128i._i32[4] to _m128i._i8

c++ x86 sse simd

Is there a way to increase a value in an xmm register?

assembly x86 addition sse

SSE optimisation for a loop that finds zeros in an array and toggles a flag + updates another array

c++ optimization x86 sse simd

What are the names and meanings of the intrinsic vector element types, like epi64x or pi32?

intel sse intrinsics sse2 mmx

Square root of an OpenCV grey image using SSE

c++ opencv sse simd

What's the equivalent of vbroadcastsd for xmm registers?

assembly x86 sse avx

Comparison and Extraction using SSE

c++ c sse simd

How to check inf for AVX intrinsic __m256

c++ c sse intrinsics avx

Floating-point multiplication: LOSING speed with AVX against SSE?

c++ performance sse avx

__m256d TRANSPOSE4 Equivalent?

c++ matrix sse transpose avx

Convert __m128d to double

c++ sse

Intel intrinsics: multiply interleaved 8bit values

c intel sse simd intrinsics

Enabling arch:SSE2 makes program slower

c++ sse

Are there Neon equivalents to Sse2 _mm_unpackhi/lo_epi32/64 and _mm_shuffle_epi8/32?

c++ arm sse simd neon

Convert __m128i value into std::tuple

c++ c++11 sse simd

AVX 3.6x slower than IA32 in simple benchmark involving <cmath> operations - why so? (VS2013)

c++ visual-studio sse simd avx

What is the fastest/best way to combine registers with arbitrary lane selections in AVX/SSE?

intel sse intrinsics avx

Do the higher level SSE flags imply the lower ones in GCC / clang?

gcc sse