I am trying to have each process use the same global variable, but my processes are not picking up the shared value.
import multiprocessing

count = 0

def smile_detection(thread_name):
    global count
    for x in range(10):
        count += 1
        print(thread_name, count)
    return count

x = multiprocessing.Process(target=smile_detection, args=("Thread1",))
y = multiprocessing.Process(target=smile_detection, args=("Thread2",))
x.start()
y.start()
I am getting output like
Thread1 1
Thread1 2
.
.
Thread1 9
Thread1 10
Thread2 1
Thread2 2
.
.
Thread2 9
Thread2 10
What I want is:
Thread1 1
Thread1 2
.
.
Thread1 9
Thread1 10
Thread2 11
Thread2 12
.
.
Thread2 19
Thread2 20
What do I have to do to achieve this?
A child process created with the 'fork' start method inherits a copy of the parent's global variables as they were at the moment of the fork. From then on, each process has its own copy: a global is still global only within its own process, and a change made in one process is not visible in any other. So the answer is no, global variables are not shared between processes after a call to fork().
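A minimal sketch of that behaviour, assuming nothing beyond the standard library. The names increment_in_child and run_demo are made up for this illustration; the child increments its copy of the global and reports it through a queue, while the parent's copy stays at its original value.

```python
import multiprocessing

count = 0  # module-level global; every process works on its own copy


def increment_in_child(queue):
    """Increment the global inside the child and report the child's view."""
    global count
    count += 1
    queue.put(count)


def run_demo():
    """Return (value seen by the child, value the parent sees afterwards)."""
    queue = multiprocessing.Queue()
    child = multiprocessing.Process(target=increment_in_child, args=(queue,))
    child.start()
    child_value = queue.get()  # read the child's report before joining it
    child.join()
    return child_value, count


if __name__ == "__main__":
    child_value, parent_value = run_demo()
    print("child saw:", child_value)      # the child's copy was incremented
    print("parent sees:", parent_value)   # the parent's copy is unchanged
```

The child's increment never reaches the parent, which is exactly why both processes in the question count from 1 to 10.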
The canonical way to share information across modules within a single program is to create a special module (often called config or cfg). Import the config module in all modules of your application; the module then becomes available as a global name. In general, don't use from modulename import *.
The Process class allocates all the tasks in memory, whereas Pool keeps only the executing processes in memory. So when the number of tasks is large we can use Pool, and when the number of tasks is small we can use the Process class.
While I was using multiprocessing, I found out that global variables are not shared between processes. Let me first provide an example of the issue I was facing. I have 2 input lists; 2 processes read from them, append their items to a final list, and the aggregated list is printed to stdout.
On Sharing Large Arrays When Using Python's Multiprocessing: Recently, I was asked about sharing large numpy arrays when using Python's multiprocessing.Pool. While not explicitly documented, this is indeed possible. I will write about this small trick in this short article.
We need to use multiprocessing.Manager().list(). From Python’s documentation: “The multiprocessing.Manager returns a started SyncManager object which can be used for sharing objects between processes. The returned manager object corresponds to a spawned child process and has methods which will create shared objects and return corresponding proxies.”
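As a rough sketch of the two-lists scenario described above: one process per input list appends to a list created by a manager. The function names append_items and aggregate and the sample inputs are assumptions made for this example, not code from the quoted article.

```python
import multiprocessing


def append_items(items, shared_list):
    """Append every item to the manager-backed list proxy."""
    for item in items:
        shared_list.append(item)


def aggregate(first, second):
    """Run one process per input list and return the combined result."""
    with multiprocessing.Manager() as manager:
        shared_list = manager.list()  # lives in the manager's server process
        workers = [
            multiprocessing.Process(target=append_items, args=(items, shared_list))
            for items in (first, second)
        ]
        for worker in workers:
            worker.start()
        for worker in workers:
            worker.join()
        return list(shared_list)  # copy out before the manager shuts down


if __name__ == "__main__":
    # The order of the items depends on scheduling, so sort for a stable display.
    print(sorted(aggregate([1, 2, 3], [4, 5, 6])))
```

Each append goes through the proxy to the manager's process, so both workers really do modify the same list. That round trip costs more than shared memory, which is worth keeping in mind for large amounts of data.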
Handling shared state is a bit trickier with multiprocessing than with threading, because a new process is forked (or spawned), especially on Windows. To have a shared object, use a multiprocessing.Array or a multiprocessing.Value. In the case of the array, each process can wrap its memory in another structure, e.g. a numpy array. In your case, I would do something like this:
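import ctypes
import multiprocessing

def smile_detection(thread_name, count):
    for x in range(10):
        # += on a shared Value is a read-modify-write, so hold its built-in lock
        with count.get_lock():
            count.value += 1
            print(thread_name, count.value)  # print the number, not the wrapper object

if __name__ == "__main__":  # needed where processes are spawned, e.g. on Windows
    count = multiprocessing.Value(ctypes.c_int, 0)  # (type, initial value)
    x = multiprocessing.Process(target=smile_detection, args=("Thread1", count))
    y = multiprocessing.Process(target=smile_detection, args=("Thread2", count))
    x.start()
    y.start()
    x.join()
    y.join()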
You can use a multiprocessing.Value. From the documentation: "Return a ctypes object allocated from shared memory. By default the return value is actually a synchronized wrapper for the object."
The code would be like this:
import multiprocessing

def smile_detection(thread_name, count):
    for x in range(10):
        # The wrapper itself does not support +=; update its .value under its lock
        with count.get_lock():
            count.value += 1
            print(thread_name, count.value)

if __name__ == "__main__":
    count = multiprocessing.Value('i', 0)
    x = multiprocessing.Process(target=smile_detection, args=("Thread1", count))
    y = multiprocessing.Process(target=smile_detection, args=("Thread2", count))
    x.start()
    y.start()
    x.join()
    y.join()
Be aware that the output will likely not be what you expect. In your expected output, all the iterations of Thread1 come before those of Thread2. That is not guaranteed when the two run concurrently, because their iterations can interleave. If you want that strict ordering, you do not want them to run concurrently at all!
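If the strict ordering from the question really is the goal, one option is to wait for the first process to finish before starting the second. The sketch below assumes that approach; run_sequentially is a name invented for this example, and the shared counter is a multiprocessing.Value guarded by its own get_lock().

```python
import multiprocessing


def smile_detection(name, counter):
    """Increment the shared counter ten times, printing each new value."""
    for _ in range(10):
        with counter.get_lock():
            counter.value += 1
            print(name, counter.value, flush=True)


def run_sequentially():
    """Run the two workers one after the other and return the final count."""
    counter = multiprocessing.Value("i", 0)
    for name in ("Thread1", "Thread2"):
        worker = multiprocessing.Process(target=smile_detection, args=(name, counter))
        worker.start()
        worker.join()  # the next process starts only after this one has ended
    return counter.value


if __name__ == "__main__":
    print("final count:", run_sequentially())
```

Because the second process is only started after the first has been joined, it continues from 11 up to 20. The price is that nothing runs in parallel any more, so this only makes sense when the ordering matters more than the speed.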
To share data between processes you need to let multiprocessing.Manager manage the shared data:
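import multiprocessing

# One manager for both objects; each Manager() call starts its own server process.
# Reaching these through globals relies on the 'fork' start method.
manager = multiprocessing.Manager()
count = manager.Value('i', 0)  # creating shared variable
lock = manager.Lock()  # acquired before every update of count.value

def smile_detection(thread_name):
    global count
    for x in range(10):
        lock.acquire()
        count.value += 1  # the proxy itself does not support +=
        current = count.value
        lock.release()
        print(thread_name, current)
    return current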
Try doing it like this:
import multiprocessing

def smile_detection(thread_name, counter, lock):
    for x in range(10):
        with lock:
            counter.value += 1
            print(thread_name, counter.value)

if __name__ == "__main__":
    count = multiprocessing.Value('i', 0)
    lock = multiprocessing.Lock()
    x = multiprocessing.Process(target=smile_detection, args=("Thread1", count, lock))
    y = multiprocessing.Process(target=smile_detection, args=("Thread2", count, lock))
    x.start()
    y.start()
    x.join()
    y.join()
The first problem is that global variables are not shared between processes. You need a mechanism with some type of process-safe locking or synchronization. We can use multiprocessing.Value('i', 0) to create a synchronized integer value that lives in shared memory. We use our multiprocessing.Lock() to ensure that only one process can update the counter at a time.
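To check that the lock really keeps the count exact when several processes run at once, a sketch along these lines can help. The names add_many and count_with_lock and the worker and iteration counts are arbitrary choices for this illustration.

```python
import multiprocessing


def add_many(counter, lock, times):
    """Increment the shared counter `times` times, one locked step at a time."""
    for _ in range(times):
        with lock:  # only one process at a time may read, add and write back
            counter.value += 1


def count_with_lock(workers=4, times=1000):
    """Start several concurrent workers and return the final counter value."""
    counter = multiprocessing.Value("i", 0)
    lock = multiprocessing.Lock()
    processes = [
        multiprocessing.Process(target=add_many, args=(counter, lock, times))
        for _ in range(workers)
    ]
    for process in processes:
        process.start()
    for process in processes:
        process.join()
    return counter.value


if __name__ == "__main__":
    print(count_with_lock())  # workers * times when every update is locked
```

Without the lock, two processes could read the same old value and both write back the same new one, so some increments would be lost.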
If you really want to use the global variable, you can use multiprocessing.Manager(), which can stay in a global variable:
import multiprocessing

# One manager for both objects; each Manager() call starts its own server process.
# Module-level globals reach the children only with the 'fork' start method.
manager = multiprocessing.Manager()
count = manager.Value('i', 0)
lock = manager.Lock()

def smile_detection(thread_name):
    global count, lock
    for x in range(10):
        with lock:
            count.value += 1  # the global is named count, not counter
            print(thread_name, count.value)

x = multiprocessing.Process(target=smile_detection, args=("Thread1",))
y = multiprocessing.Process(target=smile_detection, args=("Thread2",))
x.start()
y.start()
x.join()
y.join()
But, personally, I like the first method better, as a Manager() overcomplicates this.
Here's the output now:
$ python test.py
Thread1 1
Thread1 2
Thread1 3
Thread1 4
Thread1 5
Thread1 6
Thread1 7
Thread1 8
Thread1 9
...
Thread2 15
Thread2 16
Thread2 17
Thread2 18
Thread2 19
Thread2 20