New posts in micro-optimization

Which sequence of instructions has better performance for zeroing one register or another?

When source registers in avx instruction can be reused

Using Intrinsics to Extract And Shift Odd/Even Bits

Should I aggressively release memory while reading a file line by line in Perl?

Long latency instruction

Why is this reordering of sub and mul instructions helpful?

3D Morton code computation utilizing carry-less multiplication

x86 assembly - optimization of clamping rax to [ 0 .. limit )

Why does some Windows bootloader code zero registers with `sub` instead of `xor`?

Does it make sense to use a relaxed load followed by a conditional fence, if I don't always need acquire semantics?

How can I resolve data dependency in pointer arrays?

Optimize lookup tables to simple ALU

A checklist for Spacy optimization?

Why dependency in a loop iteration can't be executed together with the previous one

Is it possible to make MSVC's __assume(0) aka std::unreachable() actually optimize?

Why doesn’t Clang use vcnt for __builtin_popcountll on AArch32?