Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Efficiency of Bitwise XOR in c++ in comparison to more readable methods

I've recently been writing some code for a research project that I'm working on, where efficiency is very important. I've been considering scraping some of the regular methods I do things in and using bitwise XORs instead. What I'm wondering is if this will make if a difference (if I'm performing this operation say several million times) or if it's the same after I use 03 in g++.

The two examples that come to mind:

I had an instance where (I'm working with purely positive ints) I needed to change n to n-1 if n was odd or n to (n+1) if n was even. I figured I had a few options:

if(n%2) // or (n%2==0) and flip the order
    n=n-1
else
    n=n+1

or

n=n+2*n%2-1; //This of course was silly, but was the only non-bitwise 1 line I could come up with

Finally:

n=n^1;

All of the methods clearly do the same thing, but my feeling was that the third one would be the most efficient.

The next example is on a more general note. Say I'm comparing two positive integers, will one of these perform better than the others. Or will the difference really not be noticeable, even if I perform this operation several million times:

if(n_1==n_2)
if(! (n_1 ^ n_2) )
if( n_1 ^ n_2) else \do work here

Will the compiler just do the same operation in all of these instances? I'm just curious if there is an instance when I should use bitwise operations and not trust the compiler to do the work for me.

Fixed:In correct statement of problem.

like image 255
JSchlather Avatar asked Feb 22 '10 01:02

JSchlather


1 Answers

It's easy enough to check, just fire up your disassembler. Take a look:

f.c:

unsigned int f1(unsigned int n)
{
  n ^= 1;  
  return n;
}

unsigned int f2(unsigned int n)
{
  if (n % 2)
    n=n-1;
  else
    n=n+1;

  return n;
}

Build and disassemble:

$ cc -O3 -c f.c 
$ otool -tV f.o 
f.o:
(__TEXT,__text) section
_f1:
00  pushq   %rbp
01  movq    %rsp,%rbp
04  xorl    $0x01,%edi
07  movl    %edi,%eax
09  leave
0a  ret
0b  nopl    _f1(%rax,%rax)
_f2:
10  pushq   %rbp
11  movq    %rsp,%rbp
14  leal    0xff(%rdi),%eax
17  leal    0x01(%rdi),%edx
1a  andl    $0x01,%edi
1d  cmovel  %edx,%eax
20  leave
21  ret

It looks like f1() is a bit shorter, whether or not that matters in reality is up to some benchmarking.

like image 62
Carl Norum Avatar answered Oct 17 '22 00:10

Carl Norum