
Using the multiprocessing module with paramiko

I'm trying to use the paramiko Python module (1.7.7.1) to execute commands on, and/or transfer files to, a group of remote servers in parallel. One task looks like this:

import multiprocessing

jobs = []
for obj in appObjs:
    if obj.stop_app:
        # obj already holds live paramiko connections created in the parent process
        p = multiprocessing.Process(target=exec_cmd, args=(obj, obj.stop_cmd))
        jobs.append(p)
        print("Starting job %s" % p)
        p.start()

"obj" contains, among other things, a paramiko SSHClient, transport, and SFTPClient. The appObjs list contains approximately 25 of these objects, and thus 25 connections to 25 different servers.

I get the following error, with paramiko's transport.py in the backtrace:

raise AssertionError("PID check failed. RNG must be re-initialized after fork(). Hint: Try Random.atfork()")

I patched /usr/lib/python2.6/site-packages/paramiko/transport.py based on the post at https://github.com/newsapps/beeswithmachineguns/issues/17 but it doesn't seem to have helped. I've verified that the transport.py in the path mentioned above is the one being used. The paramiko mailing list appears to have disappeared.

Does this look like a problem in paramiko or am I misunderstanding/misapplying the multiprocessing module? Would anyone be willing to suggest a practical workaround? Many thanks,

asked Jun 22 '11 at 21:06 by murphy




1 Answer

UPDATE: As @ento notes, the forked ssh package has been merged back into Paramiko, so the answer below is now obsolete and you should be using Paramiko again.

This is a known problem in Paramiko. It has been fixed in a fork of Paramiko (which had stalled at version 1.7.7.1); the fork is published on PyPI simply as the ssh package and is at version 1.7.11 as of this writing.
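For context, the assertion quoted in the question is the kind of error raised by a random number generator that remembers which process created it. The sketch below is only an illustration of that idea: PidCheckedRNG is a hypothetical class written for this answer, not the actual code in paramiko or its crypto dependency, and the real check may differ in detail.

```python
import os


class PidCheckedRNG(object):
    """Hypothetical RNG wrapper that refuses to run in a process it was not set up in."""

    def __init__(self):
        # Remember the process that initialized this generator.
        self._pid = os.getpid()

    def atfork(self):
        # Re-initialize for the current process (what the error's hint asks for).
        self._pid = os.getpid()

    def read(self, n):
        # A forked child inherits this object but has a different PID.
        if os.getpid() != self._pid:
            raise AssertionError(
                "PID check failed. RNG must be re-initialized after fork().")
        return os.urandom(n)
```

A child started with multiprocessing.Process on Linux inherits such an object along with its recorded PID, so the first read in the child fails unless atfork() is called there first.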

Apparently there were problems getting some important patches into mainline Paramiko and the maintainer was unresponsive, so @bitprophet, the maintainer of Fabric, forked Paramiko under the new package name ssh on PyPI. The specific problem you mention is discussed here and is one of the reasons he decided to fork; you can read the gory details if you really want to.
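If upgrading is not an option, one workaround that may help is to avoid carrying live paramiko objects across the fork at all: pass only plain data (host name, command) to each child and open the SSH connection inside the child. The following is a minimal sketch of that approach, not a tested fix for the asker's setup. The names open_client and start_jobs are made up for illustration, the exec_cmd here takes a host name rather than the question's obj, and the host key policy and key-based authentication are assumptions.

```python
import multiprocessing


def open_client(host):
    """Open a fresh SSH connection; meant to be called only inside the worker."""
    import paramiko  # imported lazily so the sketch loads without paramiko installed
    client = paramiko.SSHClient()
    # Assumption: unknown host keys are acceptable in this environment.
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(host)  # assumption: key-based auth is already configured
    return client


def exec_cmd(host, cmd, connect=open_client):
    """Run cmd on host over a connection created in the current process."""
    client = connect(host)
    try:
        stdin, stdout, stderr = client.exec_command(cmd)
        return stdout.read()
    finally:
        client.close()


def start_jobs(targets):
    """Start one process per (host, cmd) pair; no SSH state crosses the fork."""
    jobs = []
    for host, cmd in targets:
        p = multiprocessing.Process(target=exec_cmd, args=(host, cmd))
        jobs.append(p)
        p.start()
    return jobs
```

The return value of exec_cmd is discarded when it runs as a Process target, so a real script would need a Queue or similar to collect output. Since the work is mostly waiting on the network, using threads instead of processes is another way to sidestep the fork entirely.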

answered Sep 28 '22 at 15:09 by aculich
