New posts in x86

How to implement an efficient _mm256_madd_epi8?

c++ x86 simd intrinsics avx2

MOVSD performance depends on arguments

Combining prefixes in SSE

Assembly language programming hints and tips [closed]

assembly x86 nasm

Process unaligned part of a double array, vectorize the rest

c++ c x86 vectorization sse

How does the NEG instruction affect the flags on x86?

Why are these 8 byte-writes not optimized into a MOV?

How to solve qemu gdb debug error: Remote 'g' packet reply is too long?

gcc x86 gdb qemu osdev

Which x86 instruction has a 10-byte immediate?

Outputting integers in assembly on Linux

assembly x86 nasm

Why does Hyper-threading get reported as supported on processors without it?

x86 intel hyperthreading cpuid

How can I detect when Android x86 is emulating ARM?

android-ndk x86 arm

Automatically generate FMA instructions in MSVC

c++ visual-c++ x86 avx fma

How to ask GCC to completely unroll this loop (i.e., peel this loop)?

c gcc x86 hpc loop-unrolling

Can the LSD issue uOPs from the next iteration of the detected loop?

Utilizing the LDT (Local Descriptor Table)

c assembly x86

Why do I get a different SHA1 hash between Powershell and 32bit-Python on a system DLL?

python powershell x86 64-bit

Why is one of these sooooo much faster than the other?

Is it legal to optimize away stores/construction of volatile stack variables?

How to force NASM to encode [1 + rax*2] as disp32 + index*2 instead of disp8 + base + index?