
How to share variables across scripts in python?

The following does not work

one.py

import shared
shared.value = 'Hello'
raw_input('A cheap way to keep process alive..')

two.py

import shared
print shared.value

run on two command lines as:

>>python one.py
>>python two.py

(the second one gets an attribute error, rightly so).

Is there a way to accomplish this, that is, share a variable between two scripts?

asked Dec 01 '09 by azarias


People also ask

How do you share a variable across a file in Python?

The canonical way to share information across modules within a single program is to create a special module (often called config or cfg). Import the config module in all modules of your application; the module then becomes available as a global name. In general, don't use from modulename import *.

Can I use variables from another file Python?

First import the file into the current program; then you can import the variable or access it through the module. There are three common approaches to importing variables from another file: import <file> and access them as <file>.variable; from <file> import <variable> to bind the name directly; or from <file> import * and then use the variables directly (generally discouraged).
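As a minimal sketch of those approaches (the module and variable names here are made up for illustration):

settings.py

value = 'Hello'

main.py

import settings                 # 1) access it as settings.value
from settings import value      # 2) bind the name directly
# from settings import *        # 3) pulls in everything (generally discouraged)

print settings.value
print value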

How do I share global variables across modules?

We can create a config module and store in it all the global variables to be shared across modules or scripts. By simply importing config, every global variable defined in it becomes available for use in the other modules.
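A minimal sketch of this pattern, with made-up file names (note that this shares state between modules of a single running program, not between separately started processes):

config.py

# shared mutable state lives here
counter = 0

update.py

import config
config.counter += 1             # every importer sees the same module object

show.py

import config
import update                   # importing update runs the increment above
print config.counter            # -> 1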


2 Answers

Hope it's OK to jot down my notes about this issue here.

First of all, I appreciate the example in the OP a lot, because that is where I started as well - although it made me think shared is some built-in Python module, until I found a complete example at [Tutor] Global Variables between Modules ??.

However, when I looked for "sharing variables between scripts" (or processes) - besides the case when a Python script needs to use variables defined in other Python source files (but not necessarily running processes) - I mostly stumbled upon two other use cases:

  • A script forks itself into multiple child processes, which then run in parallel (possibly on multiple processors) on the same PC
  • A script spawns multiple other child processes, which then run in parallel (possibly on multiple processors) on the same PC

As such, most hits regarding "shared variables" and "interprocess communication" (IPC) discuss cases like these two; however, in both of these cases one can observe a "parent", to which the "children" usually have a reference.

What I am interested in, however, is running multiple invocations of the same script, run independently, and sharing data between those (as in Python: how to share an object instance across multiple invocations of a script), in a singleton/single instance mode. That kind of problem is not really addressed by the above two cases - instead, it essentially reduces to the example in the OP (sharing variables across two scripts).

Now, when dealing with this problem in Perl, there is IPC::Shareable, which "allows you to tie a variable to shared memory", using "an integer number or 4 character string[1] that serves as a common identifier for data across process space". Thus, there are no temporary files, nor networking setups - which I find great for my use case; so I was looking for the same in Python.

However, as the accepted answer by @Drewfer notes: "You're not going to be able to do what you want without storing the information somewhere external to the two instances of the interpreter"; or in other words: either you have to use a networking/socket setup - or you have to use temporary files (ergo, no shared RAM for "totally separate python sessions").

Now, even with these constraints, it is somewhat difficult to find working examples (except for pickle) - even in the docs for mmap and multiprocessing. I have managed to find some other examples - which also describe some pitfalls that the docs do not mention:

  • Usage of mmap: working code in two different scripts at Sharing Python data between processes using mmap | schmichael's blog
    • Demonstrates how both scripts change the shared value
    • Note that here a temporary file is created as storage for saved data - mmap is just a special interface for accessing this temporary file (a minimal mmap sketch is included right after this list)
  • Usage of multiprocessing: working code at:
    • Python multiprocessing RemoteManager under a multiprocessing.Process - working example of SyncManager (via manager.start()) with shared Queue; server(s) writes, clients read (shared data)
    • Comparison of the multiprocessing module and pyro? - working example of BaseManager (via server.serve_forever()) with shared custom class; server writes, client reads and writes
    • How to synchronize a python dict with multiprocessing - this answer has a great explanation of multiprocessing pitfalls, and is a working example of SyncManager (via manager.start()) with shared dict; server does nothing, client reads and writes
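
For reference, here is a minimal sketch of that mmap approach (this is not schmichael's code; the backing file name /tmp/shared_mmap and the sizes are made up), again as two scripts run in separate terminals, writer first:

mmap_write.py

import mmap
import time

FNAME = '/tmp/shared_mmap'   # hypothetical backing file for the mapping
SIZE = 32

# pre-size the backing file - mmap needs a non-empty file to map
f = open(FNAME, 'wb')
f.write('\x00' * SIZE)
f.close()

f = open(FNAME, 'r+b')
mm = mmap.mmap(f.fileno(), SIZE)

for i in range(10):
    mm.seek(0)
    mm.write(str(i).ljust(SIZE))   # overwrite the whole region each time
    mm.flush()
    time.sleep(1)

mm.close()
f.close()

mmap_read.py

import mmap
import time

FNAME = '/tmp/shared_mmap'
SIZE = 32

f = open(FNAME, 'r+b')
mm = mmap.mmap(f.fileno(), SIZE)

for _ in range(10):
    mm.seek(0)
    print mm.read(SIZE).rstrip()   # prints whatever the writer last stored
    time.sleep(1)

mm.close()
f.close()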

Thanks to these examples, I came up with an example, which essentially does the same as the mmap example, with approaches from the "synchronize a python dict" example - using BaseManager (via manager.start() through file path address) with shared list; both server and client read and write (pasted below). Note that:

  • multiprocessing managers can be started either via manager.start() or server.serve_forever()
    • serve_forever() blocks - start() doesn't
    • There is an auto-logging facility in multiprocessing: it seems to work fine with start()ed processes - but seems to ignore the ones run via serve_forever()
  • The address specification in multiprocessing can be an IP (socket) address or a temporary file (possibly a pipe?) path; in the multiprocessing docs:
    • Most examples use multiprocessing.Manager() - this is just a function (not a class instantiation) which returns a SyncManager, a special subclass of BaseManager; and uses start() - but not for IPC between independently run scripts; here a file path is used
    • A few other examples use the serve_forever() approach for IPC between independently run scripts; here an IP/socket address is used (a minimal serve_forever() sketch follows this list)
    • If an address is not specified, a temp file path is used automatically (see 16.6.2.12. Logging for an example of how to see this)
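
For contrast with the start()-based a.py below, a minimal serve_forever() server might look like the following sketch (the names, port and authkey are made up, and this is not part of the examples linked above):

serve_forever_server.py

import multiprocessing.managers

class MyManager(multiprocessing.managers.BaseManager):
    pass

data = []
MyManager.register("get_data", callable=lambda: data,
                   exposed=['append', '__getitem__', '__str__'])

if __name__ == '__main__':
    manager = MyManager(address=('127.0.0.1', 50000), authkey='abc')
    server = manager.get_server()
    print 'Serving on', server.address
    server.serve_forever()   # blocks here, unlike manager.start()

A client would then create MyManager with the same address and authkey, call manager.connect(), and use manager.get_data() - just like b.py below does over the file path address.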

In addition to all the pitfalls described in the "synchronize a python dict" post, there are further ones in the case of a list. That post notes:

All manipulations of the dict must be done with methods and not dict assignments (syncdict["blast"] = 2 will fail miserably because of the way multiprocessing shares custom objects)

The workaround for dict['key'] getting and setting is to use the dict's public methods get and update. The problem is that there are no such public-method alternatives for list[index]; thus, for a shared list, we additionally have to register the __getitem__ and __setitem__ methods (which are special methods of list) as exposed - which means we also have to re-register all of list's public methods as well :/

Well, I think those were the most critical things; here are the two scripts - they can simply be run in separate terminals (server first); note they were developed on Linux with Python 2.7:

a.py (server):

import multiprocessing
import multiprocessing.managers

import logging
logger = multiprocessing.log_to_stderr()
logger.setLevel(logging.INFO)


class MyListManager(multiprocessing.managers.BaseManager):
    pass


syncarr = []
def get_arr():
    return syncarr

def main():

    # print dir([]) # cannot do `exposed = dir([])`!! manually:
    MyListManager.register("syncarr", get_arr,
                           exposed=['__getitem__', '__setitem__', '__str__',
                                    'append', 'count', 'extend', 'index', 'insert',
                                    'pop', 'remove', 'reverse', 'sort'])

    manager = MyListManager(address=('/tmp/mypipe'), authkey='')
    manager.start()

    # we don't use the same name as `syncarr` here (although we could);
    # just to see that `syncarr_tmp` is actually <AutoProxy[syncarr] object>
    # so we also have to expose `__str__` method in order to print its list values!
    syncarr_tmp = manager.syncarr()
    print("syncarr (master):", syncarr, "syncarr_tmp:", syncarr_tmp)
    print("syncarr initial:", syncarr_tmp.__str__())

    syncarr_tmp.append(140)
    syncarr_tmp.append("hello")

    print("syncarr set:", str(syncarr_tmp))

    raw_input('Now run b.py and press ENTER')

    print
    print 'Changing [0]'
    syncarr_tmp.__setitem__(0, 250)

    print 'Changing [1]'
    syncarr_tmp.__setitem__(1, "foo")

    new_i = raw_input('Enter a new int value for [0]: ')
    syncarr_tmp.__setitem__(0, int(new_i))

    raw_input("Press any key (NOT Ctrl-C!) to kill server (but kill client first)".center(50, "-"))
    manager.shutdown()

if __name__ == '__main__':
    main()

b.py (client):

import time

import multiprocessing
import multiprocessing.managers

import logging
logger = multiprocessing.log_to_stderr()
logger.setLevel(logging.INFO)


class MyListManager(multiprocessing.managers.BaseManager):
    pass

MyListManager.register("syncarr")

def main():
    manager = MyListManager(address=('/tmp/mypipe'), authkey='')
    manager.connect()
    syncarr = manager.syncarr()

    print "arr = %s" % (dir(syncarr))

    # note here we need not bother with __str__
    # syncarr can be printed as a list without a problem:
    print "List at start:", syncarr
    print "Changing from client"
    syncarr.append(30)
    print "List now:", syncarr

    o0 = None
    o1 = None

    while 1:
        new_0 = syncarr.__getitem__(0)  # syncarr[0]
        new_1 = syncarr.__getitem__(1)  # syncarr[1]

        if o0 != new_0 or o1 != new_1:
            print 'o0: %s => %s' % (str(o0), str(new_0))
            print 'o1: %s => %s' % (str(o1), str(new_1))
            print "List is:", syncarr

            print 'Press Ctrl-C to exit'
            o0 = new_0
            o1 = new_1

        time.sleep(1)


if __name__ == '__main__':
    main()

As a final remark, on Linux /tmp/mypipe is created - but is 0 bytes, and has attributes srwxr-xr-x (for a socket); I guess this makes me happy, as I neither have to worry about network ports, nor about temporary files as such :)

Other related questions:

  • Python: Possible to share in-memory data between 2 separate processes (very good explanation)
  • Efficient Python to Python IPC
  • Python: Sending a variable to another script

answered Oct 19 '22 by sdaau


You're not going to be able to do what you want without storing the information somewhere external to the two instances of the interpreter.
If it's just simple variables you want, you can easily dump a python dict to a file with the pickle module in script one and then re-load it in script two. Example:

one.py

import pickle

shared = {"Foo": "Bar", "Parrot": "Dead"}

fp = open("shared.pkl", "w")
pickle.dump(shared, fp)
fp.close()  # close to make sure the pickled data is flushed to disk

two.py

import pickle

fp = open("shared.pkl")
shared = pickle.load(fp)
print shared["Foo"]

answered Oct 19 '22 by Drewfer