I'm using a pretty standard threading.Event: the main thread gets to a point where it's in a loop that runs:
event.wait(60)
The other thread blocks on a request until a reply is available and then calls:
event.set()
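For reference, a minimal sketch of this setup. The function name run_demo and the time.sleep stand-in for the blocking request are assumptions made for illustration, not the real code:

```python
import threading
import time


def run_demo(reply_delay=0.2, wait_timeout=60):
    """Hypothetical reproduction: one waiter, one thread that sets the event."""
    event = threading.Event()

    def worker():
        # Stand-in for "blocks on a request until a reply is available".
        time.sleep(reply_delay)
        event.set()

    t = threading.Thread(target=worker)
    t.start()
    # Main thread: returns as soon as the event is set, or after the timeout.
    signalled = event.wait(wait_timeout)
    t.join()
    return signalled
```

Running this under strace on Python 2.7 is one way to observe the polling described in this question.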
I would expect the main thread to block in a single select for the full 60 seconds, but this is not the case. From the Python 2.7 source Lib/threading.py:
    # Balancing act: We can't afford a pure busy loop, so we
    # have to sleep; but if we sleep the whole timeout time,
    # we'll be unresponsive. The scheme here sleeps very
    # little at first, longer as time goes on, but never longer
    # than 20 times per second (or the timeout time remaining).
    endtime = _time() + timeout
    delay = 0.0005 # 500 us -> initial delay of 1 ms
    while True:
        gotit = waiter.acquire(0)
        if gotit:
            break
        remaining = endtime - _time()
        if remaining <= 0:
            break
        delay = min(delay * 2, remaining, .05)
        _sleep(delay)
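To make the wakeup rate concrete, the delay schedule from the loop above can be reproduced on its own. This is only a sketch: backoff_delays is a hypothetical helper, and it ignores the time spent outside _sleep, so real wakeups drift slightly.

```python
def backoff_delays(timeout, initial=0.0005, cap=0.05):
    """Return the sequence of sleep durations the polling loop would use."""
    delays = []
    remaining = timeout
    delay = initial
    while remaining > 0:
        # Same rule as the source: double, but never exceed the cap
        # or the time that is left.
        delay = min(delay * 2, remaining, cap)
        delays.append(delay)
        remaining -= delay
    return delays


schedule = backoff_delays(60)
print(schedule[:7])   # first few sleeps
print(len(schedule))  # total number of wakeups over the timeout
```

The first sleeps are 1, 2, 4, 8, 16 and 32 ms; after that every sleep is capped at 50 ms, so a 60-second wait costs on the order of 1200 wakeups.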
What we get is a select syscall run every 1 ms at first, backing off to one every 50 ms, for as long as the thread waits. This causes noticeable load on the machine for a thread that should simply be asleep.
Can someone please explain why there is a balancing act involved, and why this is different from a thread waiting on a file descriptor?
Second, is there a better way to implement a mostly sleeping main thread without such a tight loop?
I recently got hit by the same problem, and I also tracked it down to this exact block of code in the threading module.
It sucks.
The solution would be to either overload the threading module, or migrate to python3, where this part of the implementation has been fixed.
In my case, migrating to python3 would have been a huge effort, so I chose the former. What I did was:
1. Created a .so file (using cython) with an interface to pthread. It includes python functions which invoke the corresponding pthread_mutex_* functions, and links against libpthread. Specifically, the function most relevant to the task we're interested in is pthread_mutex_timedlock.
2. Created a new threading2 module (and replaced all import threading lines in my codebase with import threading2). In threading2, I re-defined all the relevant classes from threading (Lock, Condition, Event), and also ones from Queue which I use a lot (Queue and PriorityQueue). The Lock class was completely re-implemented using pthread_mutex_* functions, but the rest were much easier: I simply subclassed the original (e.g. threading.Event) and overrode __init__ to create my new Lock type. The rest just worked.
The implementation of the new Lock type was very similar to the original implementation in threading, but I based the new implementation of acquire on the code I found in python3's threading module (which, naturally, is much simpler than the abovementioned "balancing act" block). This part was fairly easy.
(Btw, the result in my case was 30% speedup of my massively-multithreaded process. Even more than I expected.)
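For comparison, the python3 approach boils down to one timed lock acquire instead of a polling loop. Below is a minimal sketch of that idea, assuming Python 3.2+ where Lock.acquire accepts a timeout. SimpleEvent is a toy class written for illustration; it is neither the threading2 module described above nor the standard library's implementation.

```python
import threading


class SimpleEvent:
    """Toy event whose wait() blocks in a single timed lock acquire."""

    def __init__(self):
        self._flag = False
        self._lock = threading.Lock()
        self._waiters = []

    def set(self):
        with self._lock:
            self._flag = True
            waiters, self._waiters = self._waiters, []
        for waiter in waiters:
            waiter.release()  # wakes the thread blocked in wait()

    def wait(self, timeout=None):
        with self._lock:
            if self._flag:
                return True
            waiter = threading.Lock()
            waiter.acquire()  # held until set() releases it
            self._waiters.append(waiter)
        # One blocking call; no sleep/poll loop. -1 means "no timeout".
        got = waiter.acquire(True, -1 if timeout is None else timeout)
        if not got:
            with self._lock:
                if waiter in self._waiters:
                    self._waiters.remove(waiter)
        return self._flag
```

With this shape the waiting thread makes no periodic syscalls; it is woken either by set() or by the timeout expiring.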
I totally agree with you, this is lame.
Currently, I'm sticking with a simple select call, without a timeout, listening on a pipe created beforehand. The wakeup is done by writing a character to the pipe.
See the sleep and wakeup functions in gunicorn.
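A rough sketch of that pipe-based pattern, assuming a Unix-like system (select on pipes does not work on Windows). PipeWaker and its method names are made up for illustration and are not gunicorn's code:

```python
import os
import select


class PipeWaker:
    """Sleep in select() on a pipe; another thread wakes us by writing to it."""

    def __init__(self):
        self._read_fd, self._write_fd = os.pipe()

    def sleep(self, timeout=None):
        # timeout=None blocks indefinitely in one select call.
        ready, _, _ = select.select([self._read_fd], [], [], timeout)
        if ready:
            # Drain pending wakeup bytes so the next sleep blocks again.
            os.read(self._read_fd, 4096)
            return True
        return False

    def wakeup(self):
        os.write(self._write_fd, b".")

    def close(self):
        os.close(self._read_fd)
        os.close(self._write_fd)
```

The main thread calls sleep() and the worker calls wakeup() once the reply is available. Because the wait is a single select on a file descriptor, there is no periodic polling.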