New posts in micro-optimization

array_push() vs. $array[] = .... Which is fastest? [duplicate]

How expensive is it to convert between int and double?

Cycles/cost for L1 Cache hit vs. Register on x86?

Is performance reduced when executing loops whose uop count is not a multiple of processor width?

Why date() works twice as fast if we set time zone from code?

Why does n++ execute faster than n=n+1?

Why does breaking the "output dependency" of LZCNT matter?

'Correct' unsigned integer comparison

Why are loops always compiled into "do...while" style (tail jump)?

Go: multiple len() calls vs performance?

x86_64 best way to reduce 64 bit register to 32 bit retaining zero or non-zero status

Can x86's MOV really be "free"? Why can't I reproduce this at all?

x > -1 vs x >= 0, is there a performance difference

Why does mulss take only 3 cycles on Haswell, different from Agner's instruction tables? (Unrolling FP loops with multiple accumulators)

Avoiding the overhead of C# virtual calls

fastest way to negate a number

Passing null pointer to placement new

Does calculating Sqrt(x) as x * InvSqrt(x) make any sense in the Doom 3 BFG code?

How exactly do partial registers on Haswell/Skylake perform? Writing AL seems to have a false dependency on RAX, and AH is inconsistent

Why does Intel's compiler prefer NEG+ADD over SUB?