Just testing these two small programs:
#include <thread>

int main()
{
    for (int i = 0; i < 10000000; i++)
    {
        std::this_thread::yield();
    }
    return 0;
}
and:
#include <thread>
#include <chrono>

int main()
{
    using namespace std::literals;
    for (int i = 0; i < 10000000; i++)
    {
        std::this_thread::sleep_for(0s);
    }
    return 0;
}
I get the following timings, respectively, on my system (Ubuntu 22.04 LTS, kernel 5.19.0-43-generic):
./a.out 0,33s user 1,36s system 99% cpu 1,687 total
and:
./a.out 0,14s user 0,00s system 99% cpu 0,148 total
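Per call, that works out to roughly 1.687 s / 10,000,000 ≈ 170 ns for yield() versus 0.148 s / 10,000,000 ≈ 15 ns for sleep_for(0s).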
Why is std::this_thread::yield() 10x slower than std::this_thread::sleep_for(0s)?
N.B. Timing is similar between g++ and clang++.
edit: As pointed out in the answer, this is an optimization in the STL implementation; actually calling sleep(0) is about 300x slower than yield() (roughly 50 µs vs 150 ns).
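For a more direct per-call comparison, here is a minimal benchmark sketch (not from the original post; the iteration counts and the nanosleep(0) case are my own additions for illustration) that times each variant with std::chrono::steady_clock:

#include <chrono>
#include <cstdio>
#include <thread>
#include <time.h>   // nanosleep (POSIX)

// Time `iters` calls of a callable and report the average cost per call.
template <typename F>
void bench(const char* name, int iters, F f)
{
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iters; i++)
        f();
    auto stop = std::chrono::steady_clock::now();
    auto ns = std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
    std::printf("%-16s %10.1f ns/call\n", name, double(ns) / iters);
}

int main()
{
    using namespace std::literals;
    const int iters = 1000000;  // fewer iterations than the original 10000000 to keep the run short

    bench("sleep_for(0s)", iters, [] { std::this_thread::sleep_for(0s); });
    bench("yield()",       iters, [] { std::this_thread::yield(); });
    bench("nanosleep(0)",  iters / 100, [] {     // far fewer iterations: each call costs tens of microseconds
        timespec ts{};                           // zero-length sleep that still reaches the kernel
        nanosleep(&ts, nullptr);
    });
    return 0;
}

If the figures quoted in the edit hold, the three lines should come out around 15 ns, 150 ns, and 50 µs per call, respectively.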
Taking a quick look at the source for this_thread::sleep_for:

template<typename _Rep, typename _Period>
    inline void
    sleep_for(const chrono::duration<_Rep, _Period>& __rtime)
    {
      if (__rtime <= __rtime.zero())
        return;
      ...
So sleep_for(0s) does nothing. Indeed, your test program uses 0.0s of system time; it is basically an empty loop that runs entirely in user space (in fact, I suspect that if you compile with optimizations, the loop will be removed entirely).
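To make that shortcut concrete, here is a simplified, hypothetical sleep_for-style wrapper (an illustrative sketch only, not the actual library code beyond the check quoted above): a non-positive duration returns immediately, and anything else is converted to a timespec and handed to nanosleep, which is where the syscall cost would appear.

#include <chrono>
#include <time.h>   // nanosleep, timespec (POSIX)

// Hypothetical, simplified stand-in for this_thread::sleep_for on Linux.
template <typename Rep, typename Period>
void my_sleep_for(const std::chrono::duration<Rep, Period>& rtime)
{
    // Same early return as in the quoted source: a zero or negative duration
    // never reaches the kernel, which is why sleep_for(0s) is so cheap.
    if (rtime <= rtime.zero())
        return;

    auto secs  = std::chrono::duration_cast<std::chrono::seconds>(rtime);
    auto nsecs = std::chrono::duration_cast<std::chrono::nanoseconds>(rtime - secs);

    timespec ts{};
    ts.tv_sec  = static_cast<time_t>(secs.count());
    ts.tv_nsec = static_cast<long>(nsecs.count());
    nanosleep(&ts, nullptr);  // only a positive duration pays for a syscall
}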
On the other hand, yield calls* sched_yield, which in turn calls schedule() in kernel space, so at minimum it executes some logic to check whether there is another thread to schedule.
I believe that your 0.33s of user space time is basically syscall overhead.
* Actually, it jumps to __libcpp_thread_yield, which then calls sched_yield, at least on Linux.
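If you want to take the C++ wrapper out of the picture entirely, you can loop over the underlying call yourself; this Linux-only sketch (my own, for illustration) should show essentially the same timing as the yield() program:

#include <sched.h>   // sched_yield (POSIX)

int main()
{
    // Same loop as the yield() program above, but calling the libc wrapper
    // for the sched_yield syscall directly.
    for (int i = 0; i < 10000000; i++)
    {
        sched_yield();
    }
    return 0;
}

Running the programs under strace -c should also make the difference visible: one sched_yield syscall per iteration here, and no per-iteration syscalls at all for the sleep_for(0s) version.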