I have two pieces of code, representative of a more complex scenario I am trying to debug. I am wondering if they are technically equivalent, and if not, why.
First one:
import time
from concurrent.futures import ThreadPoolExecutor

def cb(res):
    print("done", res)

def foo():
    time.sleep(3)
    res = 5
    cb(res)
    return res

with ThreadPoolExecutor(max_workers=2) as executor:
    future = executor.submit(foo)
    print(future.result())
Second one:
import time
from concurrent.futures import ThreadPoolExecutor

def cb2(fut):
    print("done", fut.result())

def foo2():
    time.sleep(3)
    return 5

with ThreadPoolExecutor(max_workers=2) as executor:
    future = executor.submit(foo2)
    future.add_done_callback(cb2)
    print(future.result())
The core of the issue is the following: I need to call a sync, slow operation (here, represented by the sleep). When that operation completes, I have to perform subsequent fast operations. In the first code, I put these operations immediately after the sync slow one. In the second code, I put them in a done callback.
In terms of implementation, I suspect the future creates a secondary thread, runs the code in the secondary thread, and this secondary thread will stop at the sync slow operation. Once this operation is completed, the secondary thread will keep going, and it can keep going either by executing the subsequent code or by calling the callbacks. I see no difference in these two pieces of code (apart from the fact that adding the callback allows injecting code from outside, an added flexibility), but I might be wrong, hence the question.
Note that I do understand that in the first case, the print is called when the future is still not resolved and in the second one it is, but it is assumed that the status is not relevant.
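The state difference mentioned in the note can be observed directly. The following is a minimal sketch, not part of the original question: the names observed, holder, ready, slow_then_inline, slow_only and on_done are made up for illustration, and the Event only exists to make the run deterministic. It records whether the future is done, and whether the code runs in the main thread, for both variants.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

observed = {}              # filled in by the worker threads
holder = {}                # lets the inline variant inspect its own future
ready = threading.Event()  # set once the main thread has finished its setup

def slow_then_inline():
    time.sleep(0.2)        # stand-in for the slow synchronous operation
    ready.wait()           # ensure holder["future"] has been assigned
    fut = holder["future"]
    # Follow-up work placed right after the slow call (first variant).
    in_main = threading.current_thread() is threading.main_thread()
    observed["inline"] = (fut.done(), in_main)
    return 5

def slow_only():
    ready.wait()           # ensure the callback is registered before finishing
    time.sleep(0.2)
    return 5

def on_done(fut):
    # Follow-up work placed in a done callback (second variant).
    in_main = threading.current_thread() is threading.main_thread()
    observed["callback"] = (fut.done(), in_main)

with ThreadPoolExecutor(max_workers=2) as executor:
    holder["future"] = executor.submit(slow_then_inline)
    second = executor.submit(slow_only)
    second.add_done_callback(on_done)
    ready.set()

# Leaving the with block waits for the workers, so both entries exist here.
print(observed["inline"])    # (False, False): future not done, worker thread
print(observed["callback"])  # (True, False): future done, worker thread
```

In both variants the follow-up code runs in a worker thread. What differs is the state of the future at that moment.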
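These two examples are not equivalent in terms of event ordering. Let's look at the lifecycle of a Future. It goes roughly like this (reverse engineered from CPython's source):
1. executor.submit() creates the Future and queues a work item for it.
2. A worker thread picks up the work item, marks the future as RUNNING, and calls your function.
3. When the function returns, the worker thread calls future.set_result(), which switches the future to FINISHED and wakes up every thread waiting on it.
4. Only after that does the same worker thread invoke the done callbacks, one by one. (A callback added to a future that is already finished is called immediately, in the thread that called add_done_callback().)

In the first example, cb(res) runs inside foo, before the future is marked FINISHED, so its print always comes before print(future.result()).

In the second example, when you execute the statement print(future.result()), your main thread blocks and becomes the future's waiter. It is unblocked right after the future switches to FINISHED, but before the callbacks start to execute. That means you cannot predict which print reaches the console first: the print in one of your callbacks or print(future.result()). From that point on they run concurrently. If your callbacks and the main thread (after future.result() returns) work on the same data without synchronization, you risk race conditions and corrupted data.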
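If the callback and the main thread do have to touch the same data, one option is to make the ordering explicit. The sketch below is an illustration under assumptions, not something from the code above: the names events, callback_finished, slow and on_done are hypothetical. It uses a threading.Event so that the main thread continues only after the callback has finished.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

events = []                            # shared by the callback and the main thread
callback_finished = threading.Event()  # set once the callback is done with events

def slow():
    time.sleep(0.2)                    # stand-in for the slow synchronous operation
    return 5

def on_done(fut):
    events.append(("callback", fut.result()))
    callback_finished.set()            # signal only after the shared list is updated

with ThreadPoolExecutor(max_workers=2) as executor:
    future = executor.submit(slow)
    future.add_done_callback(on_done)
    result = future.result()           # may return before on_done has run
    callback_finished.wait()           # explicit ordering: callback first, then main
    events.append(("main", result))

print(events)  # [('callback', 5), ('main', 5)]
```

Note that if the callback raises before calling set(), wait() blocks forever, so real code would probably pass a timeout to wait() or set the event in a finally clause. Putting the follow-up work inside the submitted function, as in the first example, avoids the problem altogether.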