New posts in micro-optimization

Why doesn't the C++ standard library utilize likely/unlikely attributes?

During thread contention how can I speed up this ConcurrentQueue implementation which uses ReaderWriterLockSlim over a regular Queue<T>

Understanding `_mm_prefetch`

X86: How to set lower half of xmm0 to 0, without affecting the upper half?

Bottleneck when using indexed addressing modes

Loading an xmm from GP regs

68000 Assembly – Build a String from Characters *not* Present in Another & Return Its Length (stack-passed params)

Access of struct member faster if located <128 bytes from start?

Does the llvm-bolt instrumentation mode result in less accurate BOLT profiles?

How do you reason about fluctuations in benchmarking data?

Fastest way to set highest order bit of rax register to lowest order bit in rdx register

Optimized 53->32 bit modulo computation on 32-bit processors

Set an XMM register to a repeating byte pattern (broadcast a constant byte)

Performance / Space implications when ordering SQL Server columns?

Using the operand-size override prefix 0x66 for instruction alignment

Assembly function address table and data under the function or in data section

Fastest way to set a single memory cell to zero or a constant in x86 assembly?

How to exchange between 2 bits in a 1-byte number

Bit packing of groups of n repeated bits in a 32-bit word, compact to 1 bit per group

Can the compiler/JIT optimize away short-circuit evaluation if there are no side-effects?