Is data safety guaranteed while using `ThreadPoolExecutor` from Python's `concurrent.futures` module?

Tags: python

I'm looking for a conceptual answer to this question.

I'm wondering whether using a thread pool in Python to run concurrent tasks guarantees that data is not corrupted, that is, that multiple threads never access critical data at the same time.

If so, how does `ThreadPoolExecutor` work internally to ensure that critical data is accessed by only one thread at a time?

Shubham Sharma asked Feb 17 '20 07:02


2 Answers

Thread pools do not guarantee that shared data is not corrupted. Threads can be switched at any bytecode boundary, so corruption is always a risk. Protect shared data with synchronization primitives such as locks, condition variables, and events. See the `threading` module docs.
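As a minimal sketch of the lock-based protection described above: the names `counter`, `counter_lock`, `add_many`, and `run` are hypothetical and chosen only for illustration. Each worker holds a `threading.Lock` while it updates the shared counter, so the read-modify-write cannot interleave with another thread's update.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

counter = 0                      # shared data (hypothetical example state)
counter_lock = threading.Lock()  # guards every update to `counter`


def add_many(n):
    """Increment the shared counter n times, holding the lock for each update."""
    global counter
    for _ in range(n):
        with counter_lock:
            counter += 1


def run(workers=8, per_task=10_000):
    """Run `workers` tasks on a pool and return the final counter value."""
    global counter
    counter = 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for every task and re-raises any worker exception.
        list(pool.map(add_many, [per_task] * workers))
    return counter


if __name__ == "__main__":
    print(run())  # 80000 with the default arguments
```

Without the `with counter_lock:` line, the total is not guaranteed, because `counter += 1` is a read followed by a write and a thread switch can fall between the two.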

`concurrent.futures.ThreadPoolExecutor` is a thread pool specialized for the `concurrent.futures` asynchronous task model, but all of the risks of traditional threading are still there.

If you are using Python's async model, code that modifies shared data should be dispatched on the main thread. Use the thread pool for autonomous tasks, especially those that wait on blocking I/O.
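One way this division of labor might look, assuming "the async model" here means `asyncio` with its event loop running on the main thread. The function names `blocking_fetch` and `gather_results` are hypothetical, and `time.sleep` stands in for real blocking I/O. The pool threads only compute and return values; the shared `results` dict is modified solely on the event loop thread.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_fetch(item):
    """Stand-in for blocking I/O; touches no shared state."""
    time.sleep(0.01)
    return item * 2


async def gather_results(items):
    results = {}  # shared data, only modified on the event loop thread
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Hand the blocking calls to the pool; each returns an awaitable future.
        futures = [loop.run_in_executor(pool, blocking_fetch, i) for i in items]
        values = await asyncio.gather(*futures)
        # Back on the event loop thread: safe to update shared data here.
        for item, value in zip(items, values):
            results[item] = value
    return results


if __name__ == "__main__":
    print(asyncio.run(gather_results([1, 2, 3])))
```

Because the workers never touch `results`, no lock is needed in this sketch. If a worker did have to update shared state directly, the usual locking rules would apply again.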

tdelaney answered Sep 30 '22 15:09


> If so, how does this ThreadPoolExecutor internally works to ensure that critical data is accessed by only one thread at a time?

It doesn't; that's your job.

High-level methods like `map` use a thread-safe work queue and do not share work items between threads. But if you have other resources that can be shared, the pool neither knows nor cares: protecting them is your problem as the developer.
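A short sketch of the safe case, where each work item is handled by exactly one worker and nothing else is shared. The names `square` and `squares` are hypothetical. Each call works only on its own argument and returns a value, and `map` yields the results in input order.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    # Uses only its own argument and local variables, so no lock is needed.
    return x * x


def squares(values):
    """Square each value on a thread pool, returning results in input order."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(square, values))


if __name__ == "__main__":
    print(squares(range(5)))  # [0, 1, 4, 9, 16]
```

The moment `square` appended to a module-level list or updated a shared dict instead of returning its result, that extra resource would need its own synchronization.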

Masklinn answered Sep 30 '22 17:09