New posts in sse2

Detect the availability of SSE/SSE2 instruction set in Visual Studio

c++ visual-studio x86 sse sse2

SIMD code runs slower than scalar code

c optimization sse simd sse2

How to convert scalar code of the double version of VDT's Pade Exp fast_ex() approx into SSE2?

c++ sse intrinsics sse2 exp

Scaling byte pixel values (y=ax+b) with SSE2 (as floats)?

c++ visual-studio x86 simd sse2

How to store the contents of a __m128d simd vector as doubles without accessing it as a union?

c x86 simd intrinsics sse2

How to optimize a cycle?

SIMD: Why is the SSE RGB to YUV color conversion about the same speed as the c++ implementation?

c++ optimization rgb yuv sse2

Is SSE2 signed integer overflow undefined?

How to extract bytes from an SSE2 __m128i structure?

SIMD array add for arbitrary array lengths

c arrays sse simd sse2

Simulating packusdw functionality with SSE2

x86 sse intrinsics sse2 sse4

Using XMM0 register and memory fetches (C++ code) is twice as fast as ASM only using XMM registers - Why?

Fastest way to perform AVX inner product operations with mixed (float, double) input vectors

c++ vectorization simd avx sse2

Best way to load/store from/to general purpose registers to/from xmm/ymm register

assembly x86 simd sse2 avx2

Why does V8 in Node.js 0.12.0 release require SSE2 CPU instructions?

node.js v8 sse2

What is __m128d?

c++ intel intrinsics sse2

What does the following assembly instruction do addsd -8(%rbp), %xmm0?

SSE multiplication of 2 64-bit integers

x86 sse simd multiplication sse2