Which is faster for bitwise NOT operation: precalculated table or `~`

Question

Theoretically, on modern CPUs, which is faster:

reading the NOT result from a precalculated table,
or calculating it with the ~ operator (in C)?

Assume the whole table fits in L1 cache.

Bitwise not:

uint8_t bitwise_not(uint8_t arg) { return ~arg; }

Table not:

// precalculating the table (once); the loop has to run inside a function
uint8_t table[0x100];
void init_table() {
    for (int i = 0; i < 0x100; ++i) { table[i] = static_cast<uint8_t>(~i); }
}

// function
uint8_t table_not(uint8_t arg) { return table[arg]; }

XOR not:

uint8_t xor_not(uint8_t arg) { return arg ^ 0xff; }

Across not one but several billion operations, is reading from L1 cache faster than a logical operation? (I think L1 is faster, but I cannot prove it.)

In practice, how can I measure it?

user207421 · Accepted Answer

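Neither. Just use the ~ operator inline in your code. It compiles to a single machine instruction that works on a register. A table lookup needs at least a memory load, which has more latency than a register operation even on an L1 hit, and a function call that is not inlined adds further overhead. Neither can be faster.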

I can't account for your strange belief that L1 cache is faster than registers.
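
On the question of how to measure it in practice: one possible approach is a small timing harness like the sketch below. It is only an illustration, not a rigorous benchmark. All names (`Result`, `make_not_table`, `make_input`, `measure`, `compare_all`) are made up for this example, and the numbers you get will depend on the compiler, the optimization flags, and the CPU. In particular, an optimizer may vectorize or otherwise transform the loops, so inspecting the generated assembly alongside the timings is advisable.

```cpp
#include <array>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// All names below are illustrative; none come from the original post.

struct Result {
    std::chrono::nanoseconds elapsed;  // wall-clock time for the whole run
    std::uint64_t checksum;            // sum of all outputs, keeps the work observable
};

// Build the 256-entry lookup table once.
std::array<std::uint8_t, 256> make_not_table() {
    std::array<std::uint8_t, 256> t{};
    for (std::size_t i = 0; i < t.size(); ++i) t[i] = static_cast<std::uint8_t>(~i);
    return t;
}

// Input buffer holding the byte values 0, 1, ..., 255, repeated as needed.
std::vector<std::uint8_t> make_input(std::size_t n) {
    std::vector<std::uint8_t> data(n);
    for (std::size_t i = 0; i < n; ++i) data[i] = static_cast<std::uint8_t>(i);
    return data;
}

// Apply `op` to every byte of `data`, `rounds` times, and time the loop.
template <class Op>
Result measure(const std::vector<std::uint8_t>& data, int rounds, Op op) {
    std::uint64_t sum = 0;
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < rounds; ++r)
        for (std::uint8_t b : data) sum += op(b);
    const auto stop = std::chrono::steady_clock::now();
    return {std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start), sum};
}

// Time the three variants from the question on the same input.
// Returns the results in the order: ~, xor, table.
std::array<Result, 3> compare_all(std::size_t n, int rounds) {
    const auto table = make_not_table();
    const auto data = make_input(n);

    const Result r_not = measure(data, rounds,
        [](std::uint8_t b) { return static_cast<std::uint8_t>(~b); });
    const Result r_xor = measure(data, rounds,
        [](std::uint8_t b) { return static_cast<std::uint8_t>(b ^ 0xff); });
    const Result r_table = measure(data, rounds,
        [&table](std::uint8_t b) { return table[b]; });

    // All three variants must produce the same outputs.
    assert(r_not.checksum == r_xor.checksum && r_not.checksum == r_table.checksum);
    return {r_not, r_xor, r_table};
}
```

To try it, call something like `compare_all(1 << 15, 100000)` from `main` (roughly 3.3 billion operations per variant), print each `elapsed.count()`, and build with optimizations enabled, for example `g++ -O2`. Running each variant several times and comparing the spread gives a better picture than a single run.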

Tags:

c++

cpu-architecture

cpu-cache

micro-optimization

vladon
