Changing a global variable when multiprocessing in Python

What I am ultimately trying to do is read a line, do some calculations with the information in that line, and then add the result to some global object, but I can never get it to work. For instance, test is always 0 in the code below. I know this approach is wrong, and I have tried other ways, but it still isn't working.

import multiprocessing as mp

File = 'HGDP_FinalReport_Forward.txt'
#short_file = open(File)
test = 0

def pro(temp_line):
    global test
    temp_line = temp_line.strip().split()
    test = test + 1
    return len(temp_line)

if __name__ == "__main__":
    with open("HGDP_FinalReport_Forward.txt") as lines:
        pool = mp.Pool(processes = 10)
        t = pool.map(pro,lines.readlines())
user1423020 asked Jun 19 '12 21:06

1 Answer

The worker processes spawned by the pool each get their own copy of the global variable and update only that copy. They don't share memory unless you set that up explicitly. The easiest solution is to communicate the final value of test back to the main process, e.g. via the return value. Something like (untested):

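import multiprocessing as mp

def pro(temp_line):
    # Report this call's contribution through the return value
    # instead of updating a global in the worker process.
    test = 0
    temp_line = temp_line.strip().split()
    test = test + 1
    return test, len(temp_line)

if __name__ == "__main__":
    with open("somefile.txt") as lines:
        pool = mp.Pool(processes = 10)
        tests_and_t = pool.map(pro,lines.readlines())
        # Split the (test, length) pairs into two tuples.
        tests, t = zip(*tests_and_t)
        # Combine the per-line counts in the main process.
        test = sum(tests)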
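
If you really do want the workers to update one shared value, here is a rough sketch of "setting that up explicitly" with multiprocessing.Value. This is not part of the original answer: the names init_worker, counter and count_lines are made up for illustration, and the sample lines stand in for the real file. The shared object is handed to each worker through the pool's initializer, and the lock guards the increment.

```python
import multiprocessing as mp

# Set in every worker by init_worker (hypothetical helper name).
counter = None

def init_worker(shared_counter):
    # Runs once in each worker process; stores the shared object in a global
    # so that pro() can reach it.
    global counter
    counter = shared_counter

def pro(temp_line):
    fields = temp_line.strip().split()
    # "+=" on a shared Value is not atomic, so hold its lock while updating.
    with counter.get_lock():
        counter.value += 1
    return len(fields)

def count_lines(lines, processes=4):
    # 'i' means a C signed int, starting at 0, living in shared memory.
    shared = mp.Value('i', 0)
    with mp.Pool(processes=processes,
                 initializer=init_worker,
                 initargs=(shared,)) as pool:
        lengths = pool.map(pro, lines)
    return shared.value, lengths

if __name__ == "__main__":
    # Sample input standing in for the lines of the real file.
    total, lengths = count_lines(["a b c\n", "d e\n", "f\n"])
    print(total, lengths)
```

For a simple count like this, returning the values as shown above is usually simpler, and it avoids the workers contending for the lock on every line.
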
Fred Foo answered Sep 28 '22 21:09