This is a situation I encounter frequently as an inexperienced programmer and am wondering about particularly for an ambitious, speed-intensive project of mine I'm trying to optimize. For the major C-like languages (C, objC, C++, Java, C#, etc) and their usual compilers, will these two functions run just as efficiently? Is there any difference in the compiled code?
void foo1(bool flag)
{
    if (flag)
    {
        //Do stuff
        return;
    }
    //Do different stuff
}

void foo2(bool flag)
{
    if (flag)
    {
        //Do stuff
    }
    else
    {
        //Do different stuff
    }
}
Basically, is there ever a direct efficiency bonus/penalty when breaking or returning early? How is the stack frame involved? Are there optimized special cases? Are there any factors (like inlining or the size of "Do stuff") that could affect this significantly?
I'm always a proponent of improved legibility over minor optimizations (I see foo1 a lot with parameter validation), but this comes up so frequently that I'd like to set aside all worry once and for all.
And I'm aware of the pitfalls of premature optimization... ugh, those are some painful memories.
EDIT: I accepted an answer, but EJP's answer explains pretty succinctly why the use of a return is practically negligible (in assembly, the return creates a 'branch' to the end of the function, which is extremely fast. The branch alters the PC register and may also affect the cache and pipeline, which is pretty minuscule.) For this case in particular, it literally makes no difference because both the if/else and the return create the same branch to the end of the function.
There is no difference at all:
=====> cat test_return.cpp
extern void something();
extern void something2();
void test(bool b)
{
    if(b)
    {
        something();
    }
    else
        something2();
}
=====> cat test_return2.cpp
extern void something();
extern void something2();
void test(bool b)
{
    if(b)
    {
        something();
        return;
    }
    something2();
}
=====> rm -f test_return.s test_return2.s
=====> g++ -S test_return.cpp
=====> g++ -S test_return2.cpp
=====> diff test_return.s test_return2.s
=====> rm -f test_return.s test_return2.s
=====> clang++ -S test_return.cpp
=====> clang++ -S test_return2.cpp
=====> diff test_return.s test_return2.s
=====>
Both diffs are empty, meaning there is no difference whatsoever in the generated code, even without optimization, in two compilers.
The short answer is, no difference. Do yourself a favour and stop worrying about this. The optimising compiler is almost always smarter than you.
Concentrate on readability and maintainability.
If you want to see what happens, build these with optimisations on and look at the assembler output.
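A minimal sketch for trying this yourself, assuming a single file that holds both variants from the question. The helpers do_stuff and do_different_stuff and the two counters are hypothetical stand-ins, not part of the original question. Compiling it with something like g++ -O2 -S and comparing the bodies of foo1 and foo2 in the resulting assembly should show whether your compiler treats them differently.

```cpp
// Hypothetical stand-ins for "Do stuff" / "Do different stuff".
// The counters are volatile so the optimizer cannot discard the updates.
static volatile int stuff_calls = 0;
static volatile int other_calls = 0;

void do_stuff()           { stuff_calls = stuff_calls + 1; }
void do_different_stuff() { other_calls = other_calls + 1; }

// Early-return form (foo1 in the question).
void foo1(bool flag) {
    if (flag) {
        do_stuff();
        return;
    }
    do_different_stuff();
}

// if/else form (foo2 in the question).
void foo2(bool flag) {
    if (flag) {
        do_stuff();
    } else {
        do_different_stuff();
    }
}
```

Whatever the assembly looks like, the two functions are behaviourally identical: each call runs exactly one of the two helpers.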
Interesting answers: although I agree with all of them (so far), this question has implications that have been completely disregarded up to now.
If the simple example above is extended with resource allocation, followed by error checking that may have to free those resources again, the picture might change.
Consider the naive approach beginners might take:
int func(..some parameters...) {
    res_a a = allocate_resource_a();
    if (!a) {
        return 1;
    }
    res_b b = allocate_resource_b();
    if (!b) {
        free_resource_a(a);
        return 2;
    }
    res_c c = allocate_resource_c();
    if (!c) {
        free_resource_b(b);
        free_resource_a(a);
        return 3;
    }
    do_work();
    free_resource_c(c);
    free_resource_b(b);
    free_resource_a(a);
    return 0;
}
The above represents an extreme version of the early-return style. Notice how the code becomes very repetitive and hard to maintain as its complexity grows. Nowadays people might use exception handling to deal with these failures:
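int func(..some parameters...) {
    // Assumes the res_* types are pointer-like; initialise them so the
    // catch block can test which resources were actually acquired.
    res_a a = NULL;
    res_b b = NULL;
    res_c c = NULL;
    try {
        a = allocate_resource_a(); // throws ExceptionResA
        b = allocate_resource_b(); // throws ExceptionResB
        c = allocate_resource_c(); // throws ExceptionResC
        do_work();
    }
    catch (const ExceptionBase& e) {
        // Could use the type of e here to distinguish, or
        // use separate catch clauses instead.
        // ExceptionBase must be the base class of ExceptionResA/B/C.
        if (c) free_resource_c(c);
        if (b) free_resource_b(b);
        if (a) free_resource_a(a);
        throw; // rethrow the original exception
    }
    // Success path: release in reverse order of acquisition.
    free_resource_c(c);
    free_resource_b(b);
    free_resource_a(a);
    return 0;
}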
Philip suggested, after looking at the goto example below, to use a break-less switch/case inside the catch block above. One could switch(typeof(e)) and then fall through the free_resourcex() calls but this is not trivial and needs design consideration. And remember that a switch/case without breaks is exactly like the goto with daisy-chained labels below...
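As a rough illustration of the fall-through idea, here is a sketch that switches on a plain counter of acquired resources instead of on the exception type (switching on a type is not something C++ offers directly). The function release_acquired, the counter convention, and the logging helpers are all hypothetical and exist only for this example.

```cpp
#include <string>

// Hypothetical log so the order of the releases can be inspected.
static std::string cleanup_log;

void free_resource_a() { cleanup_log += 'a'; }
void free_resource_b() { cleanup_log += 'b'; }
void free_resource_c() { cleanup_log += 'c'; }

// 'acquired' is the number of resources that were successfully acquired
// (0 to 3). Each case deliberately falls through to the next one, so the
// resources are released in reverse order of acquisition.
void release_acquired(int acquired) {
    switch (acquired) {
        case 3:
            free_resource_c();
            // fall through
        case 2:
            free_resource_b();
            // fall through
        case 1:
            free_resource_a();
            // fall through
        default:
            break;
    }
}
```

For instance, release_acquired(2) releases b and then a, which mirrors jumping to the middle of a chain of cleanup labels.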
As Mark B pointed out, in C++ it is considered good style to follow the Resource Acquisition Is Initialization principle, RAII for short. The gist of the concept is to use object instantiation to acquire resources. The resources are then automatically freed as soon as the objects go out of scope and their destructors are called. For interdependent resources, special care has to be taken to ensure the correct order of deallocation and to design the types of objects such that the required data is available to all destructors.
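A minimal RAII sketch of the idea, under the assumption that each resource can be wrapped in a small guard class. The Resource class, the release_log string, and the fail_after_b parameter are hypothetical names used only for illustration; a real guard would hold an actual handle and free it in the destructor.

```cpp
#include <string>

// Hypothetical log that records the order in which resources are released.
static std::string release_log;

// Minimal RAII guard: "acquires" in the constructor, releases in the destructor.
class Resource {
public:
    explicit Resource(char name) : name_(name) {}
    ~Resource() { release_log += name_; }

    // Non-copyable, so a resource is never released twice.
    Resource(const Resource&) = delete;
    Resource& operator=(const Resource&) = delete;

private:
    char name_;
};

// Returns 0 on success, 1 when asked to bail out early (hypothetical codes).
int func(bool fail_after_b) {
    Resource a('a');
    Resource b('b');
    if (fail_after_b) {
        return 1;  // b, then a, are released automatically here
    }
    Resource c('c');
    // do_work() would go here
    return 0;      // c, b, a are released in reverse order of construction
}
```

With guards like this, an early return needs no explicit cleanup code at all, because locals are destroyed in reverse order of construction on every exit path.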
Or, in pre-exception days, one might have written:
int func(..some parameters...) {
    res_a a = allocate_resource_a();
    res_b b = allocate_resource_b();
    res_c c = allocate_resource_c();
    if (a && b && c) {
        do_work();
    }
    if (c) free_resource_c(c);
    if (b) free_resource_b(b);
    if (a) free_resource_a(a);
    return 0;
}
But this over-simplified example has several drawbacks: it can be used only if the allocated resources do not depend on each other (e.g. it could not be used for allocating memory, then opening a file handle, then reading data from the handle into the memory), and it does not provide individual, distinguishable error codes as return values.
To keep code fast(!), compact, and easily readable and extensible, Linus Torvalds enforced a different style for kernel code that deals with resources, even using the infamous goto in a way that makes perfect sense:
int func(..some parameters...) {
    res_a a;
    res_b b;
    res_c c;
    a = allocate_resource_a();
    if (!a) goto error_a;   // nothing acquired yet, nothing to free
    b = allocate_resource_b();
    if (!b) goto error_b;   // only a needs freeing
    c = allocate_resource_c();
    if (!c) goto error_c;   // b and a need freeing
    do_work();
    free_resource_c(c);
error_c:
    free_resource_b(b);
error_b:
    free_resource_a(a);
error_a:
    return 0;
}
The gist of the discussion on the kernel mailing lists is that most language features that are "preferred" over the goto statement are implicit gotos, such as huge, tree-like if/else, exception handlers, loop/break/continue statements, etc. And the gotos in the above example are considered OK, since they jump only a small distance, have clear labels, and free the code of other clutter for keeping track of the error conditions. This question has also been discussed here on Stack Overflow.
However, what's missing in the last example is a nice way to return an error code. I was thinking of adding a result_code++ after each free_resource_x() call, and returning that code, but this offsets some of the speed gains of the above coding style. And it's hard to return 0 in case of success. Maybe I'm just unimaginative ;-)
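One common way to get distinct error codes out of the goto-cleanup style is to set a result variable just before each jump. The following is only a sketch of that idea, not the answer author's solution: acquire, release, live_resources, and the fail_at parameter are hypothetical stand-ins that simulate allocation failures so the function can be exercised.

```cpp
// Hypothetical stand-ins: allocation number 'id' succeeds unless it
// equals 'fail_at'. live_resources counts what is currently held.
static int live_resources = 0;

static bool acquire(int id, int fail_at) {
    if (id == fail_at) return false;
    ++live_resources;
    return true;
}

static void release() { --live_resources; }

// Returns 0 on success, or 1/2/3 for the allocation that failed.
int func(int fail_at) {
    int result = 0;  // declared before any goto, so no jump skips an initialization

    if (!acquire(1, fail_at)) { result = 1; goto out; }
    if (!acquire(2, fail_at)) { result = 2; goto free_a; }
    if (!acquire(3, fail_at)) { result = 3; goto free_b; }

    // do_work() would go here

    release();       // resource c
free_b:
    release();       // resource b
free_a:
    release();       // resource a
out:
    return result;
}
```

The success path falls through every label with result still 0, while each failure path enters the chain at the point that frees exactly what had been acquired.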
So, yes, I do think there is a big difference between coding with premature returns and coding without them. But I also think it is apparent only in more complicated code that is harder or impossible for the compiler to restructure and optimize, which is usually the case once resource allocation comes into play.
Even though this isn't much of an answer, a production compiler is going to be much better at optimizing than you are. I would favor readability and maintainability over these kinds of optimizations.