Multiprocessing output differs between Linux and Windows - Why?

I am attempting to pass a shared secret to child processes. In a Linux environment this works. In a Windows environment the child does not receive the shared secret. The three files below are a simple example of what I'm trying to do:

main.py

import multiprocessing
import module1
import module2

if __name__ == "__main__":
    module1.init()
    process = multiprocessing.Process(target=module2.start)
    process.start()
    process.join()

module1.py

import ctypes
import multiprocessing

x = None

def init():
    global x
    x = multiprocessing.Value(ctypes.c_wchar_p, "asdf")

module2.py

import module1

def start():
    print(module1.x.value)

In an Ubuntu 14.04 environment, on Python 3.5, I receive the following output:

$ python3 main.py
asdf

In a CentOS 7 environment, I receive the following output:

$ python3 main.py
asdf

Using the Windows Subsystem for Linux on Windows 10 (both before and after the Creators Update, so Ubuntu 14.04 and 16.04), I get the following output:

$ python3 main.py
asdf

However, in both Windows 7 and Windows 10 environments, using either Python 3.5 or 3.6, I get an AttributeError instead of the above output:

Process Process-1:
Traceback (most recent call last):
  File "C:\Python\Python35\lib\multiprocessing\process.py", line 249, in _bootstrap
    self.run()
  File "C:\Python\Python35\lib\multiprocessing\process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "H:\Development\replicate-bug\module2.py", line 5, in start
    print(module1.x.value)
AttributeError: 'NoneType' object has no attribute 'value'

I am using a shared ctype. This value should be inherited by the child process.

Why do I receive this AttributeError in a Windows environment, but not a Linux environment?

Andy asked May 01 '17 03:05

1 Answer

As mentioned in one of the posts automatically linked on the sidebar, Windows does not have the fork system call that is present on *NIX systems.

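This means that instead of inheriting a copy of the parent's global state (as *NIX processes do after fork), a Windows child process starts out completely separate: multiprocessing launches a fresh Python interpreter for it. This includes modules, which are imported again in the child.

What I suspect is happening is that module1 gets loaded anew in the child. Because module1.init() is only called under the if __name__ == "__main__" guard, it runs in the parent but not in the child, so the module1 you access inside module2.start still has x set to None.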

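The multiprocessing programming guidelines warn about exactly this: if code run in a child process accesses a global variable, the value it sees (if any) may not be the same as the value in the parent at the time Process.start was called. Only globals that are plain module-level constants are exempt. In either case, the solution is to explicitly pass what the child needs as an argument. Pass the shared value rather than the module itself: when the child is spawned, the arguments are pickled, and module objects cannot be pickled.

module2.py

def start(shared):
    # The shared object arrives as an argument instead of via a module global.
    print(shared.value)

main.py

import multiprocessing
import module1
import module2

if __name__ == '__main__':
    module1.init()
    # Hand the Value created by init() to the child explicitly.
    process = multiprocessing.Process(target=module2.start, args=(module1.x,))
    process.start()
    process.join()

One caveat: ctypes.c_wchar_p stores a pointer, and a pointer is only meaningful in the address space of the process that created it, so a spawned child may fail to read the string even though the Value itself arrives. A shared character array, for example multiprocessing.Array(ctypes.c_wchar, "asdf") read with shared[:], does not have that problem.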
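
As a variation on passing state explicitly, here is a minimal single-file sketch that hands the child a shared character array instead of a module or a c_wchar_p pointer. It is an illustration under assumptions, not the original poster's code: the helper names make_secret and read_secret are made up, and it assumes the secret fits in a fixed-size array of wide characters.

```python
import ctypes
import multiprocessing


def make_secret(text):
    # Fixed-size array of wide characters allocated in shared memory.
    # The characters themselves are shared, not a pointer to them.
    return multiprocessing.Array(ctypes.c_wchar, text)


def read_secret(secret):
    # Slicing a c_wchar array returns a str; the synchronized wrapper
    # takes its lock for the duration of the read.
    return secret[:]


def start(secret):
    # Runs in the child: the array was passed in as an argument.
    print(read_secret(secret))


if __name__ == "__main__":
    secret = make_secret("asdf")
    process = multiprocessing.Process(target=start, args=(secret,))
    process.start()
    process.join()
```

Because the array is passed through args, it does not matter whether the child was forked or spawned, and no module global has to be initialised in the child.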

Vogel612 answered Oct 06 '22 20:10