Should I manually add locks when futures write to the same file, to guarantee they write it one by one? I mean concurrent.futures.ThreadPoolExecutor. (I know the Java executor is thread safe.)
An example:
def task():
    with open("somefile", "a") as fh:
        fh.write(part_of_data)
    do_something()
    with open("somefile", "a") as fh:
        fh.write(other_data)
In this example, I want to make sure each other_data is appended right after its part_of_data when the tasks are executed in a ThreadPoolExecutor. I'm not sure whether the with statement is an atomic operation, and if it isn't, whether the executor at least guarantees the file is opened and closed correctly.
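For concreteness, the manual locking I have in mind would look something like this (just a sketch; part_of_data, other_data, and do_something are the placeholders from my example):

import threading

lock = threading.Lock()  # shared by every task

def task():
    # Hold the lock across both writes so no other task
    # can append anything between them.
    with lock:
        with open("somefile", "a") as fh:
            fh.write(part_of_data)
        do_something()
        with open("somefile", "a") as fh:
            fh.write(other_data)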
Here's my research after looking at the implementation of the ThreadPoolExecutor in the standard library:
The shutdown() method is thread safe, as it takes a lock before modifying shared state.
The submit() method is thread safe in only some of its operations, but in practice it won't matter. When you submit a new function to be executed in the thread pool, your function is placed in an instance of the thread-safe queue.SimpleQueue. Worker threads then block on this queue, waiting to pop submitted functions off and execute them. Because the queue is thread safe, the dispatching of submitted functions is thread safe, which means that none of your submitted functions will be orphaned (never executed) or executed twice.
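To make the dispatch pattern concrete, here is a simplified sketch of the mechanism (an illustration under my reading of the code, not the actual stdlib implementation):

import queue
import threading

work_queue = queue.SimpleQueue()  # put() and get() are thread safe

def worker():
    while True:
        fn = work_queue.get()  # blocks until a function is submitted
        if fn is None:         # sentinel telling this worker to exit
            return
        fn()                   # each submitted function is popped exactly once

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()

# The "submit" side: any thread can enqueue work without extra locking.
for i in range(10):
    work_queue.put(lambda i=i: print("task", i))

for _ in threads:
    work_queue.put(None)  # shut the workers down
for t in threads:
    t.join()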
The part that is not thread safe is the internal _adjust_thread_count() method. It could be called from two different threads at the same time, creating a race where both threads see that num_threads < self._max_workers and both create new threads to fill up the thread pool. If this ever happens, though, it won't matter, as it just results in extra threads in the pool. That's hardly a problem worth worrying about for most projects.
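The race is the classic check-then-act pattern; roughly like this hypothetical sketch (again, not the actual stdlib code):

import threading

class PoolSketch:
    def __init__(self, max_workers):
        self._max_workers = max_workers
        self._threads = []

    def _worker_loop(self):
        pass  # pop work off a queue, as sketched above

    def _adjust_thread_count(self):
        # Two callers can both pass this check before either appends,
        # so the pool may briefly grow past _max_workers. Harmless:
        # the result is just a few extra idle worker threads.
        if len(self._threads) < self._max_workers:
            t = threading.Thread(target=self._worker_loop, daemon=True)
            self._threads.append(t)
            t.start()

None of this serializes your task bodies, though: the executor guarantees each submitted function runs exactly once, but concurrently running tasks can interleave their file writes freely. So yes, to keep each other_data next to its part_of_data, you need an explicit lock held across both writes, as in the sketch in the question.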