Why does summing a counter from multiple threads give the correct result?

Tags:

python

My code is:

import threading

counter = 0

def worker():
    global counter
    counter += 1

if __name__ == "__main__":
    threads = []
    for i in range(1000):
        t = threading.Thread(target = worker)
        threads.append(t)
        t.start()
    for t in threads:
        t.join()

    print(counter)

Because I don't use a lock to protect the shared resource (the counter variable), I expected the result to be a number less than 1000, but counter is always 1000 and I don't know why. Is counter += 1 an atomic operation in Python?

Which operations in Python are atomic under the GIL?

555

asked Feb 25 '13 01:02

remykits

1 Answer

Don't count on x += 1 being thread-safe. Here is an example where it does not work (see Josiah Carlson's comment):

import threading
x = 0
def foo():
    global x
    for i in xrange(1000000):
        x += 1
threads = [threading.Thread(target=foo), threading.Thread(target=foo)]
for t in threads:
    t.daemon = True
    t.start()
for t in threads:
    t.join()
print(x)

If you disassemble foo:

In [80]: import dis

In [81]: dis.dis(foo)
  4           0 SETUP_LOOP              30 (to 33)
              3 LOAD_GLOBAL              0 (xrange)
              6 LOAD_CONST               1 (1000000)
              9 CALL_FUNCTION            1
             12 GET_ITER            
        >>   13 FOR_ITER                16 (to 32)
             16 STORE_FAST               0 (i)

  5          19 LOAD_GLOBAL              1 (x)
             22 LOAD_CONST               2 (1)
             25 INPLACE_ADD         
             26 STORE_GLOBAL             1 (x)
             29 JUMP_ABSOLUTE           13
        >>   32 POP_BLOCK           
        >>   33 LOAD_CONST               0 (None)
             36 RETURN_VALUE

You see that there is a LOAD_GLOBAL to retrieve the value of x, there is an INPLACE_ADD, and then a STORE_GLOBAL.
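The same check can be scripted, which helps because opcode names differ between interpreter versions (the listing above is from Python 2, and newer CPython releases may show a different opcode for the addition). A minimal sketch, where `increment` is a hypothetical stand-in for the loop body of foo:

```python
import dis

x = 0

def increment():
    # Hypothetical stand-in for the body of the loop in foo.
    global x
    x += 1

# Collect the opcodes that touch the global name "x", in execution order.
on_x = [ins.opname for ins in dis.get_instructions(increment)
        if ins.argval == "x"]
print(on_x)
```

Whatever the version, the load and the store of x should show up as separate instructions with the addition in between, which is the gap another thread can slip into.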

If both threads LOAD_GLOBAL in succession, then they might both load the same value of x. Then they both increment to the same number, and store the same number. So the work of one thread overwrites the work of the other. This is not thread-safe.
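The interleaving described above can be replayed by hand without any threads. This is only a sketch of one possible schedule: the names `shared`, `load` and `store` are made up for illustration, and each "thread" is split into the same load, add and store steps that the bytecode performs.

```python
# Deterministic replay of a lost update, no real threads involved.
shared = {"x": 0}

def load():
    # Plays the role of LOAD_GLOBAL.
    return shared["x"]

def store(value):
    # Plays the role of STORE_GLOBAL.
    shared["x"] = value

a = load()    # thread A reads 0
b = load()    # thread B reads 0 before A has stored anything
a = a + 1     # thread A computes 1
b = b + 1     # thread B also computes 1
store(a)      # thread A writes 1
store(b)      # thread B writes 1 again, so A's increment is lost

final = shared["x"]
print(final)  # 1, although two increments ran
```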

The final value of x would be 2000000 if the program were thread-safe, but when you run it you almost always get a number less than 2000000.

If you add a lock, you get the "expected" answer:

import threading
lock = threading.Lock()
x = 0
def foo():
    global x
    for i in xrange(1000000):
        with lock:
            x += 1
threads = [threading.Thread(target=foo), threading.Thread(target=foo)]
for t in threads:
    t.daemon = True
    t.start()
for t in threads:
    t.join()
print(x)

yields

2000000
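If the counter is shared by more than one function, it can be convenient to keep the lock next to the value it protects. A minimal sketch under that assumption: `LockedCounter` and `run` are hypothetical names, not part of the threading module.

```python
import threading

class LockedCounter:
    """Hypothetical helper: a counter whose increments are serialized by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self.value = 0

    def increment(self):
        # Only one thread at a time can run the load/add/store sequence.
        with self._lock:
            self.value += 1

def run(n_threads, n_increments):
    counter = LockedCounter()

    def work():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value

total = run(4, 10000)
print(total)  # 40000
```

Because every increment happens while the lock is held, the total is always n_threads * n_increments, at the cost of some locking overhead on each call.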

I think the reason the code you posted does not exhibit a problem:

for i in range(1000):
    t = threading.Thread(target = worker)
    threads.append(t)
    t.start()

is that each worker finishes so quickly, compared to the time it takes to spawn a new thread, that in practice the threads never compete. In Josiah Carlson's example above, each thread spends a significant amount of time in foo, which increases the chance of a collision.
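One way to test this explanation is to keep the one-increment-per-thread structure of the question but make each worker slow enough that the threads overlap. The sketch below is an illustration only: `slow_worker` is a hypothetical variant of worker, and the sleep is an artificial delay between the read and the write.

```python
import threading
import time

counter = 0

def slow_worker():
    # Hypothetical variant of worker() with the read and write pulled apart.
    global counter
    current = counter      # read the shared value
    time.sleep(0.001)      # artificial delay so other threads read the same value
    counter = current + 1  # write back, possibly overwriting another update

threads = [threading.Thread(target=slow_worker) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)
```

With the delay in place the printed value would be expected to fall well short of 50 on most runs, although the exact number depends on scheduling and varies from run to run.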

124

answered Oct 17 '22 22:10

unutbu
