Faster code:
#include <stdio.h>
#include <iostream>

long fib(int num)
{
    if (num <= 1)
        return 1;
    else
        return fib(num-1) + fib(num-2);
}

int main()
{
    long res = fib(45);
    printf("%li\n", res);
    return 0;
}
Slower code:
#include <stdio.h>

long fib(int num)
{
    if (num <= 1)
        return 1;
    else
        return fib(num-1) + fib(num-2);
}

int main()
{
    long res = fib(45);
    printf("%li\n", res);
    return 0;
}
The only difference between the two is the second line, #include <iostream>.
Both are compiled with clang++ 8.0.0-3, with the -O2 flag.
clang++-8 -O2 fib.cpp && time ./a.out # 3.59s
clang++-8 -O2 fib_io.cpp && time ./a.out # 3.15s
Edit:
It seems that the behavior changed after rebooting, with the iostream version being slower this time, which would make more sense.
I'm inclined to say that it was just a fluke, since I can't reproduce it anymore.
Thus the unbuffered nature of the iostreams and their synchronization with C-streams is the real reason why C++ iostreams are very slow. The line std::ios_base::sync_with_stdio(false); removes this very synchronization between C and C++ streams.
fstream write/read is about 2 times slower than FILE* write/read, and that is while reading a big blob of data, without any parsing or other fstream features.
<iostream>, together with all of the files it includes, the files those include, and so on, adds up to about 3000 lines. So what should be a very simple piece of code gains 3000+ lines just from that one include.
When you #include <iostream>, there is at least one side effect: an instance of std::ios_base::Init will have to be constructed and destructed (see C++ draft [ios.init]p1):
The class Init describes an object whose construction ensures the construction of the eight objects declared in <iostream> ([iostream.objects]) that associate file stream buffers with the standard C streams provided for by the functions declared in <cstdio>.
An explanation about it from cppreference:
This class is used to ensure that the default C++ streams (std::cin, std::cout, etc.) are properly initialized and destructed. The class tracks how many instances of it are created and initializes the C++ streams when the first instance is constructed as well as flushes the output streams when the last instance is destructed.
The header <iostream> behaves as if it defines (directly or indirectly) an instance of std::ios_base::Init with static storage duration: this makes it safe to access the standard I/O streams in the constructors and destructors of static objects with ordered initialization (as long as #include <iostream> is included in the translation unit before these objects were defined).
This side effect does not necessarily mean that performance should be different (either better or worse). However, it does mean that your two programs are not equivalent from the point of view of the C++ Standard.
Without looking at the actual implementation in a given standard library (or profiling it), we cannot know the detailed reason (feel free to do so and add an answer!).
Inspecting the generated code from clang on a Linux box (which seems to be your case), i.e. with libstdc++:
_GLOBAL__sub_I_a.cpp: # @_GLOBAL__sub_I_a.cpp
    push rax
    mov edi, offset std::__ioinit
    call std::ios_base::Init::Init() [complete object constructor]
    mov edi, offset std::ios_base::Init::~Init() [complete object destructor]
    mov esi, offset std::__ioinit
    mov edx, offset __dso_handle
    pop rax
    jmp __cxa_atexit # TAILCALL
Therefore, either std::ios_base::Init::Init() or __cxa_atexit has some side effect that makes the overall program faster for you.