
How to do dynamic creation of per-process queues in Python multiprocessing

I want to dynamically create multiple Processes, where each instance has a queue for incoming messages from other instances, and each instance can also create new instances. So we end up with a network of processes all sending to each other. Every instance is allowed to send to every other.

The code below would do what I want: it uses a Manager.dict() to store the queues, making sure updates are propagated, and a Lock() to protect write access to the queues. However, when adding a new queue it throws "RuntimeError: Queue objects should only be shared between processes through inheritance".

The problem is that at start-up we don't know how many queues will eventually be needed, so we have to create them dynamically. But since we can't share queues except at construction time, I don't know how to do that.
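The error can be reproduced in just a few lines -- I believe the issue is that the Queue can't be pickled when it is sent over to the manager process:

from multiprocessing import Manager, Queue

queues = Manager().dict()
queues[0] = Queue()  # RuntimeError: Queue objects should only be shared
                     # between processes through inheritance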

I know that one possibility would be to make queues a global variable instead of a managed one passed in to __init__: the problem then, as I understand it, is that additions to the queues variable wouldn't be propagated to other processes.

EDIT: I'm working on evolutionary algorithms. EAs are a type of machine learning technique. An EA simulates a "population", which evolves by survival of the fittest, crossover, and mutation. In parallel EAs, as here, we also have migration between populations, corresponding to interprocess communication. Islands can also spawn new islands, and so we need a way to send messages between dynamically-created processes.

import random, time
from multiprocessing import Process, Queue, Lock, Manager, current_process
try:
    from queue import Empty as EmptyQueueException
except ImportError:
    from Queue import Empty as EmptyQueueException

class MyProcess(Process):
    def __init__(self, queues, lock):
        super(MyProcess, self).__init__()
        self.queues = queues
        self.lock = lock
        # acquire lock and add a new queue for this process
        with self.lock:
            self.id = len(list(self.queues.keys()))
            self.queues[self.id] = Queue()

    def run(self):
        while len(list(self.queues.keys())) < 10:

            # make a new process
            new = MyProcess(self.queues, self.lock)
            new.start()

            # send a message to a random process
            dest_key = random.choice(list(self.queues.keys()))
            dest = self.queues[dest_key]
            dest.put("hello to %s from %s" % (dest_key, self.id))

            # receive messages
            message = True
            while message:
                try:
                    message = self.queues[self.id].get(False) # don't block
                    print("%s received: %s" % (self.id, message))
                except EmptyQueueException:
                    break

            # what queues does this process know about?
            print("%d: I know of %s" %
                  (self.id, " ".join([str(id) for id in self.queues.keys()])))

            time.sleep(1)

if __name__ == "__main__":
    # Construct MyProcess with a Manager.dict for storing the queues
    # and a lock to protect write access. Start.
    MyProcess(Manager().dict(), Lock()).start()
asked Aug 16 '11 by jmmcd


1 Answer

I'm not entirely sure what your use case actually is here. Perhaps if you elaborate a bit more on why you want to have each process dynamically spawn a child with a connected queue it'll be a bit more clear what the right solution would be in this situation.

Anyway, with the question as is, it seems that there is not really a good way to dynamically create pipes or queues with multiprocessing right now.

I think that if you're willing to spawn a thread within each of your processes, you may be able to use multiprocessing.connection.Listener/Client to communicate back and forth. Rather than spawning threads, I took an approach using network sockets and select to communicate between processes.
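For example, here's a minimal sketch of the Listener/Client idea; the address and authkey are arbitrary placeholders, and in your case the receiving half would run on a thread inside each process:

from multiprocessing.connection import Listener, Client

# Receiving side (run this on a thread inside each process).
listener = Listener(("localhost", 6000), authkey=b"secret")
conn = listener.accept()   # blocks until a sender connects
print(conn.recv())         # receives any picklable object

# Sending side, in some other process:
conn = Client(("localhost", 6000), authkey=b"secret")
conn.send("hello from another process")
conn.close()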

Dynamic process spawning and network sockets may still be flaky depending on how multiprocessing cleans up your file descriptors when spawning/forking a new process, and your solution will most likely work more easily on *nix derivatives. If you're concerned about socket overhead, you could use unix domain sockets to be a little more lightweight, at the cost of added complexity when running nodes on multiple worker machines.
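If you go that route the swap is small -- something like this, binding to a filesystem path instead of a TCP port (the path here is an arbitrary example, and this only works on *nix):

import os, socket

path = "/tmp/mpp-example.sock"   # placeholder path
if os.path.exists(path):
    os.unlink(path)              # stale socket files must be removed before bind
server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
server.bind(path)
server.listen(5)

client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
client.connect(path)
client.send(b"hello over a unix domain socket")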

Anyway, here's an example using network sockets and a shared process registry (a Manager dict) to accomplish this, since I was unable to find a good way to make multiprocessing do it.

import collections
import multiprocessing
import random
import select
import socket
import time


class MessagePassingProcess(multiprocessing.Process):
    def __init__(self, id_, processes):
        self.id = id_
        self.processes = processes        # shared Manager dict: id -> {"address", "listening"}
        self.queue = collections.deque()  # local inbox, filled from the socket
        super(MessagePassingProcess, self).__init__()

    def run(self):
        print("Running")
        inputs = []
        outputs = []
        server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        address = self.processes[self.id]["address"]
        print("Process %s binding to %s" % (self.id, address))
        server.bind(address)
        server.listen(5)
        inputs.append(server)
        # Mutating a value inside a Manager dict isn't propagated;
        # reassign the whole entry so other processes see the update.
        process = self.processes[self.id]
        process["listening"] = True
        self.processes[self.id] = process
        print("Process %s now listening! (%s)" % (self.id, process))
        while inputs:
            readable, writable, exceptional = select.select(inputs,
                                                            outputs,
                                                            inputs,
                                                            0.1)
            for sock in readable:
                print("Process %s has a readable socket: %s" % (self.id, sock))
                if sock is server:
                    print("Process %s has a readable server socket: %s"
                          % (self.id, sock))
                    conn, addr = sock.accept()
                    conn.setblocking(0)
                    inputs.append(conn)
                else:
                    data = sock.recv(1024)
                    if data:
                        self.queue.append(data)
                        print("non-server readable socket with data")
                    else:
                        inputs.remove(sock)
                        sock.close()
                        print("non-server readable socket with no data")

            for sock in exceptional:
                print("exception occurred on socket %s" % (sock,))
                inputs.remove(sock)
                sock.close()

            while len(self.queue) >= 1:
                print("Received:", self.queue.pop())

            # send a message to a random process:
            random_id = random.choice(list(self.processes.keys()))
            print("%s attempting to send message to %s" % (self.id, random_id))
            random_process = self.processes[random_id]
            print("random_process:", random_process)
            if random_process["listening"]:
                random_address = random_process["address"]
                s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
                try:
                    s.connect(random_address)
                except socket.error:
                    print("%s failed to send to %s" % (self.id, random_id))
                else:
                    s.send(b"Hello World!")
                finally:
                    s.close()

            time.sleep(1)

if __name__ == "__main__":
    print("hostname:", socket.getfqdn())
    manager = multiprocessing.Manager()
    processes = manager.dict()
    joinable = []
    for n in range(multiprocessing.cpu_count()):
        mpp = MessagePassingProcess(n, processes)
        # register this process's address before starting it
        processes[n] = {"id": n,
                        "address": ("127.0.0.1", 7000 + n),
                        "listening": False,
                        }
        print("processes[%s] = %s" % (n, processes[n]))
        mpp.start()
        joinable.append(mpp)
    for process in joinable:
        process.join()

With a lot of polish and testing love, this might be a logical extension to multiprocessing.Process and/or multiprocessing.Pool, as this does seem like something people would use if it were available in the standard lib. It may also be reasonable to create a DynamicQueue class that uses sockets to be discoverable to other queues; a rough illustration follows.
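Here is what such a hypothetical DynamicQueue might look like -- the name and interface are made up, and the sketch assumes small messages that fit in a single recv:

import pickle
import queue
import socket
import threading

class DynamicQueue(object):
    """Hypothetical sketch: a queue addressable by (host, port), so any
    process that learns the address can put messages onto it."""
    def __init__(self, address):
        self.address = address           # (host, port) that others can discover
        self._inbox = queue.Queue()
        self._server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self._server.bind(address)
        self._server.listen(5)
        threading.Thread(target=self._accept_loop, daemon=True).start()

    def _accept_loop(self):
        # one connection per message; assumes each message fits in one recv
        while True:
            conn, _ = self._server.accept()
            with conn:
                data = conn.recv(4096)
                if data:
                    self._inbox.put(pickle.loads(data))

    def get(self, block=True, timeout=None):
        return self._inbox.get(block, timeout)

    @staticmethod
    def put_to(address, obj):
        # connect to the DynamicQueue listening at address and send obj
        with socket.create_connection(address) as s:
            s.send(pickle.dumps(obj))

Usage would then be dq = DynamicQueue(("127.0.0.1", 7000)) on the receiving side, and DynamicQueue.put_to(("127.0.0.1", 7000), "hello") from any process that has learned the address.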

Anyway, hope it helps. Please update if you figure out a better way to make this work.

answered by stderr