
multiprocessing global variable updates not returned to parent

I am trying to return values from subprocesses, but these values are unfortunately unpicklable. Using global variables worked with the threading module, but I have not been able to retrieve updates made in subprocesses when using the multiprocessing module. I hope I'm missing something.

The results printed at the end are always the same as the initial values given to the variables dataDV03 and dataDV04. The subprocesses update these global variables, but they remain unchanged in the parent.

    import multiprocessing

    # NOT ABLE to get python to return values in passed variables.

    ants = ['DV03', 'DV04']
    dataDV03 = ['', '']
    dataDV04 = {'driver': '', 'status': ''}


    def getDV03CclDrivers(lib):  # call global variable
        global dataDV03
        dataDV03[1] = 1
        dataDV03[0] = 0  # eval( 'CCL.' + lib + '.' +  lib + '( "DV03" )' ) these are unpicklable instantiations

    def getDV04CclDrivers(lib, dataDV04):   # pass global variable
        dataDV04['driver'] = 0  # eval( 'CCL.' + lib + '.' +  lib + '( "DV04" )' )


    if __name__ == "__main__":

        jobs = []
        if 'DV03' in ants:
            j = multiprocessing.Process(target=getDV03CclDrivers, args=('LORR',))
            jobs.append(j)

        if 'DV04' in ants:
            j = multiprocessing.Process(target=getDV04CclDrivers, args=('LORR', dataDV04))
            jobs.append(j)

        for j in jobs:
            j.start()

        for j in jobs:
            j.join()

        print 'Results:\n'
        print 'DV03', dataDV03
        print 'DV04', dataDV04

I cannot post to my question so will try to edit the original.

Here is the object that is not picklable:

    In [1]: from CCL import LORR
    In [2]: lorr=LORR.LORR('DV20', None)
    In [3]: lorr
    Out[3]: <CCL.LORR.LORR instance at 0x94b188c>

This is the error returned when I use a multiprocessing.Pool to return the instance back to the parent:

    Thread getCcl (('DV20', 'LORR'),)
    Process PoolWorker-1:
    Traceback (most recent call last):
      File "/alma/ACS-10.1/casa/lib/python2.6/multiprocessing/process.py", line 232, in _bootstrap
        self.run()
      File "/alma/ACS-10.1/casa/lib/python2.6/multiprocessing/process.py", line 88, in run
        self._target(*self._args, **self._kwargs)
      File "/alma/ACS-10.1/casa/lib/python2.6/multiprocessing/pool.py", line 71, in worker
        put((job, i, result))
      File "/alma/ACS-10.1/casa/lib/python2.6/multiprocessing/queues.py", line 366, in put
        return send(obj)
    UnpickleableError: Cannot pickle <type 'thread.lock'> objects

    In [5]: dir(lorr)
    Out[5]:
    ['GET_AMBIENT_TEMPERATURE',
     'GET_CAN_ERROR',
     'GET_CAN_ERROR_COUNT',
     'GET_CHANNEL_NUMBER',
     'GET_COUNT_PER_C_OP',
     'GET_COUNT_REMAINING_OP',
     'GET_DCM_LOCKED',
     'GET_EFC_125_MHZ',
     'GET_EFC_COMB_LINE_PLL',
     'GET_ERROR_CODE_LAST_CAN_ERROR',
     'GET_INTERNAL_SLAVE_ERROR_CODE',
     'GET_MAGNITUDE_CELSIUS_OP',
     'GET_MAJOR_REV_LEVEL',
     'GET_MINOR_REV_LEVEL',
     'GET_MODULE_CODES_CDAY',
     'GET_MODULE_CODES_CMONTH',
     'GET_MODULE_CODES_DIG1',
     'GET_MODULE_CODES_DIG2',
     'GET_MODULE_CODES_DIG4',
     'GET_MODULE_CODES_DIG6',
     'GET_MODULE_CODES_SERIAL',
     'GET_MODULE_CODES_VERSION_MAJOR',
     'GET_MODULE_CODES_VERSION_MINOR',
     'GET_MODULE_CODES_YEAR',
     'GET_NODE_ADDRESS',
     'GET_OPTICAL_POWER_OFF',
     'GET_OUTPUT_125MHZ_LOCKED',
     'GET_OUTPUT_2GHZ_LOCKED',
     'GET_PATCH_LEVEL',
     'GET_POWER_SUPPLY_12V_NOT_OK',
     'GET_POWER_SUPPLY_15V_NOT_OK',
     'GET_PROTOCOL_MAJOR_REV_LEVEL',
     'GET_PROTOCOL_MINOR_REV_LEVEL',
     'GET_PROTOCOL_PATCH_LEVEL',
     'GET_PROTOCOL_REV_LEVEL',
     'GET_PWR_125_MHZ',
     'GET_PWR_25_MHZ',
     'GET_PWR_2_GHZ',
     'GET_READ_MODULE_CODES',
     'GET_RX_OPT_PWR',
     'GET_SERIAL_NUMBER',
     'GET_SIGN_OP',
     'GET_STATUS',
     'GET_SW_REV_LEVEL',
     'GET_TE_LENGTH',
     'GET_TE_LONG_FLAG_SET',
     'GET_TE_OFFSET_COUNTER',
     'GET_TE_SHORT_FLAG_SET',
     'GET_TRANS_NUM',
     'GET_VDC_12',
     'GET_VDC_15',
     'GET_VDC_7',
     'GET_VDC_MINUS_7',
     'SET_CLEAR_FLAGS',
     'SET_FPGA_LOGIC_RESET',
     'SET_RESET_AMBSI',
     'SET_RESET_DEVICE',
     'SET_RESYNC_TE',
     'STATUS',
     '_HardwareDevice__componentName',
     '_HardwareDevice__hw',
     '_HardwareDevice__stickyFlag',
     '_LORRBase__logger',
     '__del__',
     '__doc__',
     '__init__',
     '__module__',
     '_devices',
     'clearDeviceCommunicationErrorAlarm',
     'getControlList',
     'getDeviceCommunicationErrorCounter',
     'getErrorMessage',
     'getHwState',
     'getInternalSlaveCanErrorMsg',
     'getLastCanErrorMsg',
     'getMonitorList',
     'hwConfigure',
     'hwDiagnostic',
     'hwInitialize',
     'hwOperational',
     'hwSimulation',
     'hwStart',
     'hwStop',
     'inErrorState',
     'isMonitoring',
     'isSimulated']

    In [6]:

Buoy asked Jun 15 '12 17:06


2 Answers

When you use multiprocessing to open a second process, an entirely new instance of Python, with its own global state, is created. That global state is not shared, so changes made by child processes to global variables will be invisible to the parent process.
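A minimal sketch of that isolation, written for Python 3 (the question's code is Python 2). The names `counter`, `bump`, and `run_demo` are made up for illustration. The child increments its own copy of the global, and the parent's copy keeps its initial value of 0:

```python
import multiprocessing

# Module-level global: each process gets its own copy, nothing is shared.
counter = {"value": 0}


def bump():
    # Runs in the child and modifies only the child's copy of the global.
    counter["value"] += 1
    print("child sees %d" % counter["value"])


def run_demo():
    child = multiprocessing.Process(target=bump)
    child.start()
    child.join()
    # The parent's copy was never touched by the child.
    return counter["value"]


if __name__ == "__main__":
    print("parent sees %d" % run_demo())
```

With threads the same code would show the update, because threads share one process and therefore one set of globals.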

Additionally, most of the abstractions that multiprocessing provides use pickle to transfer data. All data transferred using proxies must be pickleable; that includes all the objects that a Manager provides. Relevant quotations (my emphasis):

Ensure that the arguments to the methods of proxies are picklable.

And (in the Manager section):

Other processes can access the shared objects by using proxies.

Queues also require pickleable data; the docs don't say so, but a quick test confirms it:

    import multiprocessing
    import pickle

    class Thing(object):
        def __getstate__(self):
            print 'got pickled'
            return self.__dict__
        def __setstate__(self, state):
            print 'got unpickled'
            self.__dict__.update(state)

    q = multiprocessing.Queue()
    p = multiprocessing.Process(target=q.put, args=(Thing(),))
    p.start()
    print q.get()
    p.join()

Output:

    $ python mp.py
    got pickled
    got unpickled
    <__main__.Thing object at 0x10056b350>

The one approach that might work for you, if you really can't pickle the data, is to find a way to store it as a ctypes object; a reference to the memory can then be passed to a child process. This seems pretty dodgy to me; I've never done it. But it might be a possible solution for you.
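For what it's worth, here is a rough sketch of the shared-memory idea using `multiprocessing.Value` and `multiprocessing.Array`, which allocate ctypes objects in shared memory. It is Python 3, and `read_device` and the values it writes are hypothetical stand-ins for real hardware queries. This only helps if the data you need can be reduced to plain numbers or fixed-size arrays; it will not carry an arbitrary object such as a LORR instance.

```python
import multiprocessing


def read_device(status, temps):
    # Hypothetical stand-in for querying hardware inside the child process.
    status.value = 1
    for i in range(len(temps)):
        temps[i] = 20.0 + i


def run_demo():
    status = multiprocessing.Value("i", 0)         # shared C int
    temps = multiprocessing.Array("d", [0.0] * 3)  # shared C double[3]
    # Shared ctypes objects must be handed to the child as Process arguments.
    child = multiprocessing.Process(target=read_device, args=(status, temps))
    child.start()
    child.join()
    # The parent reads what the child wrote into shared memory.
    return status.value, temps[:]


if __name__ == "__main__":
    print(run_demo())
```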

Given your update, it seems like you need to know a lot more about the internals of a LORR. Is LORR a class? Can you subclass from it? Is it a subclass of something else? What's its MRO? (Try LORR.__mro__ and post the output if it works.) If it's a pure Python object, it might be possible to subclass it, creating a __setstate__ and a __getstate__ to enable pickling.
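To make the subclassing idea concrete, here is a sketch under the assumption that the only unpicklable attribute is a lock, as the `thread.lock` error suggests. `Device` and `PicklableDevice` are hypothetical stand-ins, not the real CCL classes, and the code is Python 3. A real LORR may also hold a hardware connection that cannot be recreated this simply.

```python
import pickle
import threading


class Device(object):
    # Hypothetical stand-in for a pure-Python driver that holds a lock.
    def __init__(self, name):
        self.name = name
        self._lock = threading.Lock()


class PicklableDevice(Device):
    def __getstate__(self):
        # Copy the instance dict and drop the lock, which cannot be pickled.
        state = self.__dict__.copy()
        del state["_lock"]
        return state

    def __setstate__(self, state):
        # Restore the plain attributes and create a fresh lock.
        self.__dict__.update(state)
        self._lock = threading.Lock()


if __name__ == "__main__":
    clone = pickle.loads(pickle.dumps(PicklableDevice("DV20")))
    print(clone.name)
```

The unpickled copy is a new object with its own lock, so it does not synchronize with the original.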

Another approach might be to figure out how to get the relevant data out of a LORR instance and pass it via a simple string. Since you say that you really just want to call the methods of the object, why not just do so using Queues to send messages back and forth? In other words, something like this (schematically):

    Main Process              Child 1                       Child 2
                              LORR 1                        LORR 2
    child1_in_queue     ->    get message 'foo'
                              call 'foo' method
    child1_out_queue    <-    return foo data string
    child2_in_queue                   ->                    get message 'bar'
                                                            call 'bar' method
    child2_out_queue                  <-                    return bar data string
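The scheme above could look roughly like this in Python 3. `FakeLORR`, `worker`, and `ask` are hypothetical names, and `FakeLORR` only imitates the real class; the point is that the device object is created inside the child and never crosses a process boundary, while only method names and result strings travel through the queues.

```python
import multiprocessing


class FakeLORR(object):
    # Hypothetical stand-in for the real, unpicklable CCL.LORR.LORR.
    def __init__(self, antenna):
        self.antenna = antenna

    def GET_STATUS(self):
        return "%s: OK" % self.antenna


def worker(antenna, inbox, outbox):
    # The device lives only in this process, so it is never pickled.
    device = FakeLORR(antenna)
    while True:
        method = inbox.get()
        if method is None:  # sentinel: shut down
            break
        # Call the named method and send back a plain string.
        outbox.put(str(getattr(device, method)()))


def ask(antenna, methods):
    inbox = multiprocessing.Queue()
    outbox = multiprocessing.Queue()
    child = multiprocessing.Process(target=worker,
                                    args=(antenna, inbox, outbox))
    child.start()
    replies = []
    for name in methods:
        inbox.put(name)
        replies.append(outbox.get())
    inbox.put(None)
    child.join()
    return replies


if __name__ == "__main__":
    print(ask("DV03", ["GET_STATUS"]))
```

One child per antenna, each with its own pair of queues, matches the diagram; a production version would also want to catch exceptions in the worker and report them back as messages.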

senderle answered Sep 25 '22 04:09


@DBlas gives you a quick URL and a reference to the Manager class in an answer, but I think it's still a bit vague, so I thought it might be helpful for you to just see it applied:

    import multiprocessing
    from multiprocessing import Manager

    ants = ['DV03', 'DV04']

    def getDV03CclDrivers(lib, data_list):
        # data_list is a manager list proxy
        data_list[1] = 1
        data_list[0] = 0

    def getDV04CclDrivers(lib, data_dict):
        # data_dict is a manager dict proxy
        data_dict['driver'] = 0


    if __name__ == "__main__":

        manager = Manager()
        dataDV03 = manager.list(['', ''])
        dataDV04 = manager.dict({'driver': '', 'status': ''})

        jobs = []
        if 'DV03' in ants:
            j = multiprocessing.Process(
                    target=getDV03CclDrivers,
                    args=('LORR', dataDV03))
            jobs.append(j)

        if 'DV04' in ants:
            j = multiprocessing.Process(
                    target=getDV04CclDrivers,
                    args=('LORR', dataDV04))
            jobs.append(j)

        for j in jobs:
            j.start()

        for j in jobs:
            j.join()

        print 'Results:\n'
        print 'DV03', dataDV03
        print 'DV04', dataDV04

Because multiprocessing actually uses separate processes, you cannot simply share global variables: each process has its own memory space. What you do to a global in one process is not reflected in another. I admit it seems confusing, since the way you see it, it's all living right there in the same piece of code, so "why shouldn't those functions have access to the global?" It's harder to wrap your head around the idea that they will be running in different processes.

The Manager class exists to hand out proxies for data structures that shuttle information back and forth between processes for you. You create a special dict and list from a manager, pass them into your functions, and operate on them as if they were local.

Un-pickle-able data

For your specialized LORR object, you might need to create something like a proxy that can represent the picklable state of the instance.

Not super robust or tested much, but gives you the idea.

    from CCL import LORR  # same import as in the question's IPython session


    class LORRProxy(object):

        def __init__(self, lorrObject=None):
            self.instance = lorrObject

        def __getstate__(self):
            # how to get the state data out of a lorr instance
            # (a and b are placeholder attribute names)
            inst = self.instance
            state = dict(
                foo = inst.a,
                bar = inst.b,
            )
            return state

        def __setstate__(self, state):
            # rebuild a lorr instance from state
            lorr = LORR.LORR()
            lorr.a = state['foo']
            lorr.b = state['bar']
            self.instance = lorr

jdi answered Sep 24 '22 04:09