
Odd threading behavior in python

I have a problem where I need to pass the index of an array to a function which I define inline. The function then gets passed as a parameter to another function which will eventually call it as a callback.

The thing is, when the code gets called, the value of the index is all wrong. I eventually solved this by creating an ugly workaround but I am interested in understanding what is happening here. I created a minimal example to demonstrate the problem:

from __future__ import print_function
import threading


def works_as_expected():
    for i in range(10):
        run_in_thread(lambda: print('the number is: {}'.format(i)))

def not_as_expected():
    for i in range(10):
        run_later_in_thread(lambda: print('the number is: {}'.format(i)))

def run_in_thread(f):
    threading.Thread(target=f).start()

threads_to_run_later = []
def run_later_in_thread(f):
    threads_to_run_later.append(threading.Thread(target=f))


print('this works as expected:\n')
works_as_expected()

print('\nthis does not work as expected:\n')
not_as_expected()
for t in threads_to_run_later: t.start()

Here is the output:

this works as expected:

the number is: 0
the number is: 1
the number is: 2
the number is: 3
the number is: 4
the number is: 6
the number is: 7
the number is: 7
the number is: 8
the number is: 9

this does not work as expected:

the number is: 9
the number is: 9
the number is: 9
the number is: 9
the number is: 9
the number is: 9
the number is: 9
the number is: 9
the number is: 9
the number is: 9

Can someone explain what is happening here? I assume it has to do with enclosing scope or something, but an answer with a reference that explains this dark (to me) corner of python scoping would be valuable to me.

I'm running this on python 2.7.11

asked Feb 03 '16 at 03:02 by Stephen


1 Answer

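This is a result of how closures and scopes work in Python.

A lambda does not capture the value of i at the moment it is created. It captures the variable i itself, which lives in the scope of the enclosing function (not_as_expected). So even though you're feeding a separate lambda to each thread, all ten lambdas, and therefore all ten threads, share that one variable, and each looks up its value only when it is actually called.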

Consider this example:

def make_function():
    i = 1
    def inside_function():
        print(i)
    i = 2
    return inside_function

f = make_function()
f()

What number do you think it will print? The i = 1 assigned before the function was defined, or the i = 2 assigned after?

It prints the current value of i, i.e. 2. It doesn't matter what the value of i was when the function was created; the function always looks up the value at the time it is called. The same thing happens with your lambda functions. In not_as_expected the threads are started only after the loop has finished, so every lambda sees the final value of i, which is 9.
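The same late binding can be reproduced without any threads. The following is a minimal sketch; make_callbacks is a hypothetical helper that is not part of the question's code, and the closure-cell check at the end relies on CPython's __closure__ attribute.

```python
from __future__ import print_function


def make_callbacks():
    callbacks = []
    for i in range(3):
        # Each lambda closes over the variable i, not its current value.
        callbacks.append(lambda: i)
    return callbacks


callbacks = make_callbacks()

# The loop has finished, so every callback sees the final value of i.
results = [f() for f in callbacks]
print(results)  # [2, 2, 2]

# In CPython, the lambdas share one closure cell for i.
same_cell = callbacks[0].__closure__[0] is callbacks[1].__closure__[0]
print(same_cell)  # True
```

Since no thread is involved, this shows that the behavior comes from scoping alone; threading only changes when the lookup happens.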

Even in the output you expected, you can see it didn't always work right: it skipped 5 and displayed 7 twice. works_as_expected has the same problem; it only appears to work because each lambda usually runs before the loop gets to the next iteration. In some cases (like the 5) the loop gets through two iterations before control passes to one of the other threads, so i is advanced twice and a number is skipped. In other cases (like the 7) two threads run while the loop is still in the same iteration, and since i doesn't change between them, the same value is printed twice.

If you instead did this:

def function_maker(i):
    return lambda: print('the number is: {}'.format(i))

def not_as_expected():
    for i in range(10):
        run_later_in_thread(function_maker(i))

Here, i is a parameter of function_maker, so every call to function_maker creates a new scope with its own i, and the lambda it returns closes over that i. Each lambda now references a different variable, and it will work as expected.
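Two other common ways to get the same effect, neither of which appears in the code above, are binding the value as a default argument and using functools.partial. This is a minimal sketch under that assumption; describe is a hypothetical stand-in for the print call so that the results can be inspected.

```python
from __future__ import print_function
import functools


def describe(i):
    return 'the number is: {}'.format(i)


callbacks_default = []
callbacks_partial = []
for i in range(3):
    # A default argument is evaluated once, when the lambda is created,
    # so each lambda keeps its own copy of the current value.
    callbacks_default.append(lambda i=i: describe(i))
    # functools.partial stores the argument value at creation time.
    callbacks_partial.append(functools.partial(describe, i))

default_results = [f() for f in callbacks_default]
partial_results = [f() for f in callbacks_partial]
print(default_results)
print(partial_results)
```

Either callable could be passed to run_later_in_thread in place of function_maker(i). The default-argument form is compact, but it adds a parameter that a caller could accidentally override; functools.partial avoids that.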

answered Sep 23 '22 at 01:09 by Brendan Abel