 

Python Multiprocessing sharing of global values

What I am trying to do is make use of a global variable in each process, but my processes are not picking up the global value.

import multiprocessing

count = 0 

def smile_detection(thread_name):
    global count

    for x in range(10):
        count += 1
        print(thread_name, count)

    return count    

x = multiprocessing.Process(target=smile_detection, args=("Thread1",))
y = multiprocessing.Process(target=smile_detection, args=("Thread2",))
x.start()
y.start()

I am getting output like

Thread1 1
Thread1 2
.
.
Thread1 9
Thread1 10
Thread2 1
Thread2 2
.
.
Thread2 9
Thread2 10

what I want is

Thread1 1
Thread1 2
.
.
Thread1 9
Thread1 10
Thread2 11
Thread2 12
.
.
Thread2 19
Thread2 20

What do I have to do to achieve this?

asked Jul 12 '16 by abhishek




4 Answers

Unlike with threading, shared state is a bit trickier to handle with multiprocessing, because a new process is forked (or spawned), especially on Windows. To have a shared object, use a multiprocessing.Array or multiprocessing.Value. In the case of the array, you can, in each process, wrap its shared memory in another structure, e.g. a numpy array (a sketch of that follows the code below). In your case, I would do something like this:

import multiprocessing, ctypes

count = multiprocessing.Value(ctypes.c_int, 0)  # (type, init value)

def smile_detection(thread_name, count):

    for x in range(10):
        count.value += 1
        print(thread_name, count.value)

x = multiprocessing.Process(target=smile_detection, args=("Thread1", count))
y = multiprocessing.Process(target=smile_detection, args=("Thread2", count))
x.start()
y.start()
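
The Array-to-numpy trick mentioned at the top of this answer is not shown above, so here is a minimal sketch of it. The names fill and shared_arr are mine, and it assumes numpy is installed:

import multiprocessing, ctypes
import numpy as np

def fill(arr, start):
    # View the shared buffer as a numpy array; no data is copied
    view = np.frombuffer(arr.get_obj())  # dtype float64 matches c_double
    view[:] = np.arange(start, start + len(view))

if __name__ == "__main__":
    shared_arr = multiprocessing.Array(ctypes.c_double, 10)  # 10 shared doubles
    p = multiprocessing.Process(target=fill, args=(shared_arr, 5))
    p.start()
    p.join()
    print(np.frombuffer(shared_arr.get_obj()))  # the parent sees the child's writes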
answered by alexpeits


You can use a multiprocessing.Value:

Return a ctypes object allocated from shared memory. By default the return value is actually a synchronized wrapper for the object.

The code would be like this:

import multiprocessing

count = multiprocessing.Value('i', 0)

def smile_detection(thread_name, count):
    for x in range(10):
        count.value += 1
        print(thread_name, count.value)

x = multiprocessing.Process(target=smile_detection, args=("Thread1", count))
y = multiprocessing.Process(target=smile_detection, args=("Thread2", count))

x.start()
y.start()
x.join()
y.join()

Be aware that the output will likely not be the one that you expect. In your expected output, in fact, all the iterations of Thread 1 come before those of Thread 2. That's not the case in concurrent applications: the two processes run in parallel, so their prints interleave. If you really want that ordering, well, you do not want it to run in parallel at all! (A sketch of forcing that ordering follows.)
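
For completeness, a minimal sketch of forcing the sequential ordering from the question by joining each process before starting the next, which of course serializes the work and gives up the parallelism:

import multiprocessing

def smile_detection(thread_name, count):
    for x in range(10):
        count.value += 1
        print(thread_name, count.value)

if __name__ == "__main__":
    count = multiprocessing.Value('i', 0)
    x = multiprocessing.Process(target=smile_detection, args=("Thread1", count))
    x.start()
    x.join()  # wait for Thread1 to finish before starting Thread2
    y = multiprocessing.Process(target=smile_detection, args=("Thread2", count))
    y.start()
    y.join()  # Thread2 now prints 11 through 20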

answered by enrico.bacis


To share data between processes, you need to let multiprocessing.Manager manage the shared data:

import multiprocessing

manager = multiprocessing.Manager()  # a single manager process serves both objects
count = manager.Value('i', 0)  # the shared variable
lock = manager.Lock()  # lock to guard count.value += 1

def smile_detection(thread_name):
    global count, lock

    for x in range(10):
        lock.acquire()
        count.value += 1
        lock.release()
        print(thread_name, count.value)
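
The answer as written omits the process setup; a minimal driver for it could look like this (my addition, reusing the smile_detection above). Note that inheriting count and lock through globals relies on the fork start method; under spawn you would pass the manager proxies as arguments instead:

x = multiprocessing.Process(target=smile_detection, args=("Thread1",))
y = multiprocessing.Process(target=smile_detection, args=("Thread2",))
x.start()
y.start()
x.join()
y.join()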
answered by Samuel


Try doing it like this:

import multiprocessing

def smile_detection(thread_name, counter, lock):
    for x in range(10):
        with lock:
            counter.value += 1
            print(thread_name, counter.value)


count = multiprocessing.Value('i', 0)
lock = multiprocessing.Lock()
x = multiprocessing.Process(target=smile_detection, args=("Thread1", count, lock))
y = multiprocessing.Process(target=smile_detection, args=("Thread2", count, lock))
x.start()
y.start()
x.join()
y.join()

The first problem is that global variables are not shared between processes. You need a mechanism with some kind of process-safe locking or synchronization. multiprocessing.Value('i', 0) creates a synchronized integer in shared memory, and the multiprocessing.Lock() ensures that only one process can update the counter at a time. (As shown in the sketch below, the Value actually carries a built-in lock of its own.)
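
As an aside, not part of the original answer: the synchronized wrapper returned by multiprocessing.Value exposes its internal lock via get_lock(), so the separate Lock object can be dropped. A minimal sketch:

import multiprocessing

def smile_detection(thread_name, counter):
    for x in range(10):
        with counter.get_lock():  # the Value's own built-in lock
            counter.value += 1
            print(thread_name, counter.value)

if __name__ == "__main__":
    count = multiprocessing.Value('i', 0)
    x = multiprocessing.Process(target=smile_detection, args=("Thread1", count))
    y = multiprocessing.Process(target=smile_detection, args=("Thread2", count))
    x.start()
    y.start()
    x.join()
    y.join()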

If you really want to use the global variable, you can use multiprocessing.Manager(), whose shared objects can live in a global variable:

import multiprocessing

manager = multiprocessing.Manager()  # a single manager serves both shared objects
count = manager.Value('i', 0)
lock = manager.Lock()

def smile_detection(thread_name):
    global count, lock

    for x in range(10):
        with lock:
            count.value += 1
            print(thread_name, count.value)

x = multiprocessing.Process(target=smile_detection, args=("Thread1",))
y = multiprocessing.Process(target=smile_detection, args=("Thread2",))
x.start()
y.start()
x.join()
y.join()

But, personally, I like the first method better, as a Manager() overcomplicates this.

Here's the output now:

$ python test.py
Thread1 1
Thread1 2
Thread1 3
Thread1 4
Thread1 5
Thread1 6
Thread1 7
Thread1 8
Thread1 9
...
Thread2 15
Thread2 16
Thread2 17
Thread2 18
Thread2 19
Thread2 20
answered by Will