This question is more about fact-finding and thought process than about code.
I have many compiled C++ programs that I need to run at different times and with different parameters. I'm looking at using Python multiprocessing to read a job from a job queue (RabbitMQ) and then feed that job to a C++ program to run (maybe via subprocess). I was looking at the multiprocessing module because this will all run on a dual Xeon server, so I want to take full advantage of the multiple processors.
The Python program would be the central manager: it would simply read jobs from the queue, spawn a process (or subprocess?) with the appropriate C++ program to run the job, get the results (the subprocess's stdout and stderr), feed them to a callback, and put the process back in a queue of processes waiting for the next job to run. A rough sketch of what I have in mind is below.
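Roughly, I imagine the manager loop looking something like this (an untested sketch; it assumes the pika client for RabbitMQ, a queue named 'jobs' whose messages are command lines, and a placeholder handle_result callback; it also runs one job at a time, and the multiprocessing part is what I'm unsure about):

import shlex
import subprocess

import pika

def handle_result(out, err):
    # Placeholder callback that would receive each job's output
    print('job finished:', out, err)

def on_job(channel, method, properties, body):
    # Each message body is assumed to be the command line for one C++ run
    cmd = shlex.split(body.decode())
    proc = subprocess.run(cmd, capture_output=True, text=True)
    handle_result(proc.stdout, proc.stderr)
    channel.basic_ack(delivery_tag=method.delivery_tag)

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()
channel.queue_declare(queue='jobs')
channel.basic_consume(queue='jobs', on_message_callback=on_job)
channel.start_consuming()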
First, does this sound like a valid strategy?
Second, are there any examples of something similar to this?
Thank you in advance.
You don't need the multiprocessing module for this. The multiprocessing module is good for running Python functions as separate processes. To run a C++ program and read its results from stdout, you'd only need the subprocess module. The queue could be a list, and your Python program would simply loop while the list is non-empty.
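For instance, a single-process version could be as simple as this sketch, where the jobs list and './my_cpp_program' are placeholders for your real job arguments and compiled binary:

import shlex
import subprocess

# Loop over a plain list of job arguments, running one C++ process per job
jobs = ['arg1', 'arg2', 'arg3']
while jobs:
    job = jobs.pop(0)
    proc = subprocess.Popen(
        shlex.split('./my_cpp_program {j}'.format(j=job)),
        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = proc.communicate()
    print('{j} -> {o}'.format(j=job, o=out.decode().strip()))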
However, if you want to run several jobs in parallel through a pool of worker processes that share a queue, and feed results back into that queue, then you could do it with multiprocessing like this:
test.py:
import multiprocessing as mp
import subprocess
import shlex

def worker(q):
    while True:
        # Get an argument from the queue
        x = q.get()
        # You might change this to run your C++ program
        proc = subprocess.Popen(
            shlex.split('python test2.py {x}'.format(x=x)),
            stdout=subprocess.PIPE)
        out, err = proc.communicate()
        # Decode the bytes read from the subprocess's stdout
        out = out.decode().strip()
        print('{name}: using argument {x} outputs {o}'.format(
            x=x, name=mp.current_process().name, o=out))
        q.task_done()
        # Put a new argument into the queue
        q.put(int(out))

def main():
    q = mp.JoinableQueue()
    # Put some initial values (1, 2, 3) into the queue
    for t in range(1, 4):
        q.put(t)
    # Create and start a pool of worker processes
    for i in range(3):
        p = mp.Process(target=worker, args=(q,))
        p.daemon = True
        p.start()
    q.join()
    print("Finished!")

if __name__ == '__main__':
    main()
test2.py (a simple substitute for your C++ program):
import time
import sys

# Read an integer argument, simulate some work, then print a result
x = int(sys.argv[1])
time.sleep(0.5)
print(x + 3)
Running test.py might yield something like this:
Process-1: using argument 1 outputs 4
Process-3: using argument 3 outputs 6
Process-2: using argument 2 outputs 5
Process-3: using argument 6 outputs 9
Process-1: using argument 4 outputs 7
Process-2: using argument 5 outputs 8
Process-3: using argument 9 outputs 12
Process-1: using argument 7 outputs 10
Process-2: using argument 8 outputs 11
Process-1: using argument 10 outputs 13
Notice that the numbers in the right-hand column are fed back into the queue and are (eventually) used as arguments to test2.py, showing up as numbers in the left-hand column.
First, does this sound like a valid strategy?
Yes.
Second, are there any examples of something similar to this?
Celery is one example: a distributed task queue that can use RabbitMQ as its broker and run jobs on a pool of workers.
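As a rough sketch of what that might look like (the task name, broker URL, and './my_cpp_program' are placeholders, not from the question):

import shlex
import subprocess

from celery import Celery

app = Celery('jobs', broker='amqp://guest@localhost//')

@app.task
def run_cpp_job(args):
    # Run the compiled C++ program for one job and return its output
    proc = subprocess.run(
        shlex.split('./my_cpp_program {a}'.format(a=args)),
        capture_output=True, text=True)
    return proc.stdout

Workers started with "celery -A jobs worker" would then pull jobs from RabbitMQ and run them in parallel across the server's cores, and you'd enqueue a job with run_cpp_job.delay('arg1').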