
Do macros in C++ improve performance?

I'm a beginner in C++ and I've just read that macros work by replacing text whenever needed. In this case, does this mean that it makes the .exe run faster? And how is this different than an inline function?

For example, if I have the following macro:

#define SQUARE(x) ((x) * (x))

and a normal function:

int Square(const int& x)
{
    return x*x;
}

and an inline function:

inline int Square(const int& x)
{
    return x*x;
}

What are the main differences between these three and especially between the inline function and the macro? Thank you.

ggluta asked Mar 25 '16 00:03


3 Answers

You should avoid macros where possible. Inline functions are almost always the better choice, as they are type safe. An inline function should be as fast as a macro, provided the compiler actually inlines it; note that the inline keyword is not binding but only a hint, which the compiler may ignore (for instance when inlining is not possible).

PS: as a matter of style, avoid using const Type& for parameters of fundamental type, such as int or double. Simply use the type itself; in other words, use

int Square(int x)

since taking the argument by value won't hurt performance (passing by reference may even make it worse); see e.g. this question for more details.

vsoftco answered Nov 06 '22 20:11


Macros amount to dumb textual replacement of pattern A with pattern B, and all of it happens before the compiler proper kicks in. They sometimes come in handy, but in general they should be avoided: you can do a lot of things with them, and later on, in the debugger, you have no idea what is going on.

Besides, your approach to performance is, to put it kindly, naive. First you learn the language (which is hard for modern C++, because there are a ton of important concepts that one absolutely needs to know and understand). Then you practice, practice, practice. And then, when you really reach the point where your existing application has performance problems, you profile to understand the real issue.

In other words: if you are interested in performance, you are asking the wrong question. You should worry much more about architecture (like: potential bottlenecks), configuration (in the sense of latency between different nodes in your system), and so on. Of course, you should apply common sense; and not write code that is obviously wasting memory or CPU cycles. But sometimes a piece of code that runs 50% slower ... might be 500% easier to read and maintain. And if execution time is then 500ms, and not 250ms; that might be totally OK (unless that specific part is called a thousand times per minute).

GhostCat answered Nov 06 '22 20:11


The difference between a macro and an inlined function is that a macro is handled by the preprocessor, before the compiler proper sees the code.

On my compiler (clang++), without optimisation flags, the square function won't be inlined. The code it generates looks like this:

4009f0:       55                      push   %rbp
4009f1:       48 89 e5                mov    %rsp,%rbp
4009f4:       89 7d fc                mov    %edi,-0x4(%rbp)
4009f7:       8b 7d fc                mov    -0x4(%rbp),%edi
4009fa:       0f af 7d fc             imul   -0x4(%rbp),%edi
4009fe:       89 f8                   mov    %edi,%eax
400a00:       5d                      pop    %rbp
400a01:       c3                      retq   

The imul is the assembly instruction doing the work; the rest is moving data around. The code that calls it looks like this:

  400969:       e8 82 00 00 00          callq  4009f0 <_Z6squarei>

If I add the -O3 flag to inline it, that imul shows up in the main function, at the point where the function is called in the C++ code:

0000000000400a10 <main>:
400a10:       41 56                   push   %r14
400a12:       53                      push   %rbx
400a13:       50                      push   %rax
400a14:       48 8b 7e 08             mov    0x8(%rsi),%rdi
400a18:       31 f6                   xor    %esi,%esi
400a1a:       ba 0a 00 00 00          mov    $0xa,%edx
400a1f:       e8 9c fe ff ff          callq  4008c0 <strtol@plt>
400a24:       48 89 c3                mov    %rax,%rbx
400a27:       0f af db                imul   %ebx,%ebx

It's a reasonable thing to do to get a basic handle on assembly language for your machine and use gcc -S on your source, or objdump -D on your binary (as I did here) to see exactly what is going on.

Using the macro instead of the inlined function gives something very similar:

0000000000400a10 <main>:
400a10:       41 56                   push   %r14
400a12:       53                      push   %rbx
400a13:       50                      push   %rax
400a14:       48 8b 7e 08             mov    0x8(%rsi),%rdi
400a18:       31 f6                   xor    %esi,%esi
400a1a:       ba 0a 00 00 00          mov    $0xa,%edx
400a1f:       e8 9c fe ff ff          callq  4008c0 <strtol@plt>
400a24:       48 89 c3                mov    %rax,%rbx
400a27:       0f af db                imul   %ebx,%ebx

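Note one of the many dangers with macros. What does this do?

int x = 5; std::cout << SQUARE(++x) << std::endl;

36? Nope, 42. After preprocessing, the line becomes

std::cout << ((++x) * (++x)) << std::endl;

which becomes 6 * 7 here. Strictly speaking, x is modified twice in one expression with nothing sequencing the two increments, so the behaviour is undefined and another compiler may print something else.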

Don't be put off by people telling you not to care about optimisation. Using C or C++ as your language is an optimisation in itself. Just try to work out if you're wasting time with it and be sensible.

Hal answered Nov 06 '22 20:11