 

Using multiprocessing in Python, what is the correct approach for import statements?

PEP 8 states:

Imports are always put at the top of the file, just after any module comments and docstrings, and before module globals and constants.

However, if the class/method/function that I am importing is only used by a child process, surely it is more efficient to do the import only when it is needed? My code is basically:

import multiprocessing
import os
# BugLogClient is the bugLog crash-reporting client (its import is omitted here)

p = multiprocessing.Process(target=main, args=(dump_file,))
p.start()
p.join()  # wait for the application process to finish
print u"Process ended with exitcode: {}".format(p.exitcode)
if os.path.getsize(dump_file) > 0:  # a non-empty dump file means a crash was recorded
    blc = BugLogClient(listener='http://21.18.25.06:8888/bugLog/listeners/bugLogListenerREST.cfm',
                       appName='main')
    blc.notifyCrash(dump_file)

main() is the main application. This function needs a lot of imports to run, and those take up some RAM (roughly 35 MB). Because the application runs in another process, following PEP 8 meant the imports were being done twice (once by the parent process and again by the child process). It should also be noted that this function is only called once, as the parent process merely waits to see whether the application crashed and left an exit code (thanks to faulthandler). So I moved the imports inside the main function like this:

def main(dump_file):

    import shutil
    import locale

    import faulthandler

    from PySide.QtCore import Qt
    from PySide.QtGui import QApplication, QIcon

instead of:

import shutil
import locale

import faulthandler

from PySide.QtCore import Qt
from PySide.QtGui import QApplication, QIcon

def main(dump_file):

Is there a 'standard' way to handle imports when using multiprocessing?

PS: I've seen this sister question

asked Jan 05 '16 by Andrés Marafioti

People also ask

What does import multiprocessing do in Python?

The Python multiprocessing Process class is an abstraction that sets up another Python process, gives it code to run, and provides the parent application with a way to control its execution.
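A minimal sketch (Python 3) of that control; the idle worker loop is invented for the example:

import multiprocessing
import time

def worker():
    while True:          # idle forever; the parent decides when to stop us
        time.sleep(0.1)

if __name__ == '__main__':
    p = multiprocessing.Process(target=worker)
    p.start()
    print(p.is_alive())  # True: the child is running
    p.terminate()        # the parent forcibly stops the child
    p.join()
    print(p.exitcode)    # negative on Unix: terminated by a signal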

How does multiprocessing process work in Python?

The multiprocessing package supports spawning processes, i.e. creating and executing new child processes. To wait for a child to terminate, the parent process uses an API similar to that of the threading module.
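For example, a minimal sketch (Python 3) of spawning a child and waiting for it; the worker function and its message are invented for the illustration:

import multiprocessing

def worker(name):
    # this function runs in the child process
    print("Hello from", name)

if __name__ == '__main__':
    p = multiprocessing.Process(target=worker, args=('child',))
    p.start()          # spawn the child process
    p.join()           # the parent waits until the child terminates
    print(p.exitcode)  # 0 on a clean exit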

What are the three groups of import statements in Python?

There are generally three groups:

- standard library imports (Python's built-in modules)
- related third-party imports (modules that are installed and do not belong to the current application)
- local application imports (modules that belong to the current application)
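As a sketch, the grouping looks like this in practice; the third-party and local module names are placeholders, not from the question:

# standard library imports
import os
import shutil

# related third-party imports (placeholder example)
import requests

# local application imports (hypothetical module from this project)
from myapp.buglog import BugLogClient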

Can we do multiprocessing in Python?

This is what gives multiprocessing an upper hand over threading in Python: multiple processes can run in parallel because each process has its own interpreter that executes the instructions allocated to it.
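As a quick illustration, a sketch (Python 3) using multiprocessing.Pool to spread CPU-bound work over several worker processes; the square function is invented for the example:

import multiprocessing

def square(n):
    # CPU-bound work; each worker process has its own interpreter,
    # so these calls run in parallel, unconstrained by the GIL
    return n * n

if __name__ == '__main__':
    with multiprocessing.Pool(processes=4) as pool:
        print(pool.map(square, range(10)))  # [0, 1, 4, 9, ..., 81]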


1 Answer

The 'standard' way is the one described by PEP 8. That is exactly what PEP 8 is for: a reference guide for writing Python code.

There are always exceptions, though, and this case is one of them.

As Windows does not clone the parent process's memory, a spawned child process must re-import all of its modules. Linux forks the parent process instead, so the child inherits the already-imported modules and avoids this issue.
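A sketch (Python 3.4+, hypothetical file name) that makes this visible: under the 'spawn' start method (the Windows default) the module-level print runs again in the child, while under 'fork' (the Linux default) it does not:

# demo.py
import multiprocessing

print("module imported")  # printed twice under 'spawn', once under 'fork'

def main():
    pass

if __name__ == '__main__':
    multiprocessing.set_start_method('spawn')  # force Windows-like behaviour
    p = multiprocessing.Process(target=main)
    p.start()
    p.join()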

I'm not familiar with the details of Windows memory management, but I'd expect the modules to be shared rather than loaded twice. What you are probably seeing is the virtual memory of the two processes, not the physical memory. In physical memory, only one copy of the modules should be loaded.
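One way to check this on Linux is to compare the peak resident set size (physical memory) of the parent and the child; a rough sketch using the Unix-only resource module, not a precise measurement:

import multiprocessing
import resource

def report(label):
    # peak resident set size (KiB on Linux, bytes on macOS)
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    print(label, "peak RSS:", rss)

if __name__ == '__main__':
    report("parent")
    p = multiprocessing.Process(target=report, args=("child",))
    p.start()
    p.join()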

It is up to you whether to follow PEP 8 or not. When resources are a constraint, the code needs to adapt. But do not over-optimize the code if it is not necessary; that is the wrong approach.

answered Oct 11 '22 by noxdafox