Say I have a function that writes to a file. I also have a function that loops repeatedly reading from said file. I have both of these functions running in separate threads. (Actually I am reading/writing to registers via MDIO which is why I can't have both threads executing concurrently, only one or the other, but for the sake of simplicity, let's just say it's a file)
Now when I run the write function in isolation, it executes fairly quickly. However, when I run it threaded and have it acquire a lock first, it seems to run extremely slowly. Is this because the second thread (the read function) is polling to acquire the lock? Is there any way to get around this?
I am currently just using a simple RLock, but am open to any change that would increase performance.
Edit: As an example, I will put a basic example of what's going on. The read thread is basically always running, but occasionally a separate thread will make a call to load. If I benchmark by running load from cmd prompt, running in a thread is at least 3x slower.
write thread:
import usbmpc  # functions I made which access dll functions for hardware, etc

def load(self, lock):
    lock.acquire()
    f = open('file.txt', 'r')
    data = f.readlines()
    for x in data:
        usbmpc.write(x)
    lock.release()
read thread:
import usbmpc

def read(self, lock):
    addr = START_ADDR
    while True:
        lock.acquire()
        data = usbmpc.read(addr)
        lock.release()
        addr += 4
        if addr > BUF_SIZE: addr = START_ADDR
This is because the Python GIL is the bottleneck preventing the threads from running truly concurrently. The best CPU utilisation can be achieved with ProcessPoolExecutor or multiprocessing.Process, which sidestep the GIL by running work in separate processes.
Multithreading in Python still makes efficient use of resources, since threads share the same memory and data space; it lets multiple tasks make progress concurrently and can reduce response time for I/O-bound work.
Once a thread has acquired the lock, all subsequent attempts to acquire it block until it is released with the release() method. Calling release() on a lock that is not held raises a RuntimeError.
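For example, releasing a lock that is not currently held raises a RuntimeError:

```python
import threading

lock = threading.Lock()
lock.acquire()
lock.release()          # normal acquire/release pairing: fine

try:
    lock.release()      # releasing an already-unlocked lock
    raised = False
except RuntimeError:
    raised = True
print(raised)           # True: release() on an unlocked lock is an error
```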
In fact, a Python process cannot run threads in parallel, but it can run them concurrently through context switching during I/O-bound operations. This limitation is enforced by the GIL: the Python Global Interpreter Lock prevents threads within the same process from executing at the same time.
Do you use threading on a multicore machine?
If the answer is yes, then unless your Python version is 3.2+ you are going to suffer reduced performance when running threaded applications.
David Beazley has put considerable effort into finding out what is going on with the GIL on multicore machines, and has made it easy for the rest of us to understand it too. Check his website and the resources there. You might also want to see his presentation at PyCon 2010. It is rather interesting.
To make a long story short: in Python 3.2, Antoine Pitrou wrote a new GIL that has the same performance on single-core and multicore machines. In previous versions, the more cores/threads you have, the greater the performance loss...
hope it helps :)
Why aren't you acquiring the lock in the writer for the duration of each write only? You're currently locking for the entire duration of the load function, so the reader never gets in until load is completely done.
Secondly, you should be using the lock as a context manager (a with statement). Your current code is not exception safe: if a write raises, the lock is never released and every other thread deadlocks:
def load(lock):
    with open('file.txt') as f:
        data = f.readlines()
    for x in data:
        with lock:
            usbmpc.write(x)
The same goes for your reader. Use a context to hold the lock.
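Applied to the reader from the question, that looks like the sketch below. The lock is held only for the single hardware read, and a stub function stands in for the usbmpc module (an assumption, since the real module wraps hardware DLLs); the loop is bounded so the example terminates:

```python
import threading

# Stub standing in for usbmpc.read from the question (assumption).
def fake_read(addr):
    return addr * 2  # pretend register contents

START_ADDR = 0
BUF_SIZE = 16
lock = threading.Lock()
results = []

def read_regs(iterations):
    addr = START_ADDR
    for _ in range(iterations):
        with lock:                      # held only for the single read
            results.append(fake_read(addr))
        addr += 4                       # advance, wrapping as in the question
        if addr > BUF_SIZE:
            addr = START_ADDR

t = threading.Thread(target=read_regs, args=(8,))
t.start()
t.join()
print(results)
```

Because the lock is released between iterations, the writer gets a chance to acquire it after every single read instead of waiting for the whole loop.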
Thirdly, don't use an RLock. You know you don't need one: at no point does your read/write code need to reacquire the lock, so don't give it that opportunity; you would only be masking bugs.
The real answer is in several of the comments to your question: the GIL is causing some contention (assuming it isn't actually your misuse of locking). The Python threading module is fantastic, but the GIL sometimes is not, and the complex behaviours it generates are widely misunderstood. It's worth mentioning that throwing threads at a problem is not the panacea people believe it to be; it usually isn't the solution.