
Why does Paramiko hang if you use it while loading a module?

Put the following into a file hello.py (and easy_install paramiko if you haven't got it):

hostname,username,password='fill','these','in'
import paramiko
c = paramiko.SSHClient()
c.set_missing_host_key_policy(paramiko.AutoAddPolicy())
c.connect(hostname=hostname, username=username, password=password)
i,o,e = c.exec_command('ls /')
print(o.read())
c.close()

Fill in the first line appropriately.

Now type

python hello.py

and you'll see some ls output.

Now instead type

python

and then from within the interpreter type

import hello

and voila! It hangs! It will unhang if you wrap the code in a function foo and do import hello; hello.foo() instead.

Why does Paramiko hang when used within module initialization? How is Paramiko even aware that it's being used during module initialization in the first place?

Michael Gundlach asked Jan 14 '09 15:01


2 Answers

As JimB pointed out, this is an import issue: Python implicitly imports the utf-8 codec the first time str.decode('utf-8') is used, and that first use happens during the SSH connection attempt. See the Analysis section for details.

In general, one cannot stress enough that you should avoid having a module automatically spawn new threads on import. If you can, avoid magic module-level code altogether, as it almost always leads to unwanted side effects.

  1. The easy - and sane - fix for your problem, as already mentioned, is to put your code in an if __name__ == '__main__': body, which is executed only when you run this specific module and not when the module is imported by other modules.

  2. (not recommended) Another fix is to just do a dummy str.decode('utf-8') in your code before you call SSHClient.connect() - see analysis below.

So what's the root cause of this problem?

Analysis (simple password auth)

Hint: If you want to debug threading in Python 2, import threading and set threading._VERBOSE = True.

  1. paramiko.SSHClient().connect(.., look_for_keys=False, ..) implicitly spawns a new thread for your connection. You can also see this if you turn on debug output for paramiko.transport.

[Thread-5 ] [paramiko.transport ] DEBUG : starting thread (client mode): 0x317f1d0L

  2. This is basically done as part of SSHClient.connect(). When client.py:324::start_client() is called, an event is created (transport.py:399::event=threading.Event()) and the thread is started (transport.py:400::self.start()). Note that the start() method will then execute the class's transport.py:1565::run() method.

  3. transport.py:1580::self._log(..) prints our log message "starting thread" and then proceeds to transport.py:1584::self._check_banner().

  4. check_banner does one thing: it retrieves the SSH banner (the first response from the server) via transport.py:1707::self.packetizer.readline(timeout) (note that the timeout is just a socket read timeout), checks for a linefeed at the end, and otherwise times out.

  5. If a server banner was received, it attempts to utf-8 decode the response string (packet.py:287::return u(buf)), and that's where the deadlock happens. u(s, encoding='utf-8') does a str.decode('utf-8') and implicitly imports encodings.utf_8 in encodings:99 via encodings.search_function, ending up in an import deadlock.
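The implicit import described in the final analysis step can be observed directly. The following is a minimal sketch, not taken from paramiko: the helper name and the choice of cp865 are assumptions made for illustration (cp865 is simply a codec that is unlikely to be loaded already, whereas utf-8 is usually loaded at interpreter startup on Python 3). It uses bytes.decode so that it runs on Python 3 as well as Python 2.

```python
import sys


def decode_and_check_codec_module(codec="cp865"):
    """Decode with `codec` and report whether its encodings module was
    imported before the call and whether it is imported afterwards.

    Hypothetical helper for illustration only.
    """
    module_name = "encodings." + codec
    loaded_before = module_name in sys.modules
    # The codec lookup goes through the encodings package, which imports
    # the matching encodings.<codec> module on first use.
    b"banner".decode(codec)
    loaded_after = module_name in sys.modules
    return loaded_before, loaded_after
```

If the first value is False and the second is True, the decode call itself performed an import, which is the kind of hidden import the transport thread runs into.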

So a dirty fix is to trigger the utf-8 decoder import once up front (''.decode('utf-8')), so that the transport thread never blocks on that specific import while the module is still being imported.
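To see the "starting thread" message from the analysis yourself, debug output for the paramiko.transport logger can be switched on with the standard logging module. This is a rough sketch: the helper name and the format string are assumptions chosen to resemble the log line quoted above, not paramiko's own configuration.

```python
import logging


def enable_transport_debug(stream=None):
    """Attach a DEBUG-level handler to the 'paramiko.transport' logger.

    Hypothetical helper; `stream` defaults to sys.stderr when None.
    """
    logger = logging.getLogger("paramiko.transport")
    logger.setLevel(logging.DEBUG)
    handler = logging.StreamHandler(stream)
    # Assumed format: thread name, logger name, level, message.
    handler.setFormatter(logging.Formatter(
        "[%(threadName)-10s] [%(name)-20s] %(levelname)-5s : %(message)s"))
    logger.addHandler(handler)
    return logger
```

Call enable_transport_debug() before SSHClient.connect() so the handler is in place when the transport thread starts logging.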

Fix

dirty fix - not recommended

import paramiko
hostname,username,password='fill','these','in'
''.decode('utf-8')  # dirty fix (Python 2 str): load the utf-8 codec before connect() starts the transport thread
c = paramiko.SSHClient()
c.set_missing_host_key_policy(paramiko.AutoAddPolicy())
c.connect(hostname=hostname, username=username, password=password)
i,o,e = c.exec_command('ls /')
print(o.read())
c.close()

good fix

import paramiko
if __name__ == '__main__':
    hostname,username,password='fill','these','in'
    c = paramiko.SSHClient()
    c.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    c.connect(hostname=hostname, username=username, password=password)
    i,o,e = c.exec_command('ls /')
    print(o.read())
    c.close()
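If hello.py should stay importable and reusable, the function-wrapped variant mentioned in the question works as well. A possible sketch follows; the parameters and the command argument are assumptions added for illustration, and paramiko is imported inside the function only so that this sketch can be loaded on a machine without paramiko installed.

```python
def foo(hostname, username, password, command='ls /'):
    """Run `command` on the remote host and return its stdout.

    Nothing here executes at import time, so no thread is spawned
    while the module is still being imported.
    """
    import paramiko  # deferred only for this sketch; a top-level import is fine too

    c = paramiko.SSHClient()
    c.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    c.connect(hostname=hostname, username=username, password=password)
    try:
        i, o, e = c.exec_command(command)
        return o.read()
    finally:
        # Close the connection even if the command or the read fails.
        c.close()
```

From the interpreter this is used as import hello; print(hello.foo('fill', 'these', 'in')), with the three values filled in appropriately.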

ref paramiko issue tracker: issue 104

tintin answered Sep 18 '22 20:09


Paramiko uses separate threads for the underlying transport. You should never have a module that spawns a thread as a side effect of importing. As I understand it, there is a single import lock available, so when a child thread spawned by your module attempts another import, it can block indefinitely, because your main thread, which is still importing your module, holds the lock. (There are probably other gotchas that I'm not aware of, too.)
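The shape of that deadlock can be imitated without paramiko or the real import machinery. The sketch below is only a simulation under stated assumptions: an ordinary threading.Lock stands in for the import lock, an Event stands in for the event the auth code waits on, and all names are hypothetical. As far as I know, newer Python versions use finer-grained import locking, so treat this purely as an illustration of the mechanism.

```python
import threading


def wait_for_worker(main_holds_lock, timeout=0.5):
    """Return True if the worker's event fired before `timeout` expired."""
    import_lock = threading.Lock()      # stand-in for the import lock
    banner_ready = threading.Event()    # stand-in for the awaited event

    def transport_thread():
        # The worker needs the lock (an implicit import) before it can
        # signal the main thread.
        with import_lock:
            banner_ready.set()

    worker = threading.Thread(target=transport_thread)
    worker.daemon = True

    if main_holds_lock:
        import_lock.acquire()           # "main thread is still importing"
    try:
        worker.start()
        # The timeout keeps the simulation from hanging forever.
        fired = banner_ready.wait(timeout)
    finally:
        if main_holds_lock:
            import_lock.release()
    worker.join()
    return fired
```

With main_holds_lock=True the wait times out, because the worker cannot get the lock while the main thread holds it; with main_holds_lock=False the event fires almost immediately. The real case has no timeout, hence the hang.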

In general, modules shouldn't have side effects of any sort when importing, or you're going to get unpredictable results. Just hold off execution with the __name__ == '__main__' trick, and you'll be fine.

[EDIT] I can't seem to create a simple test case that reproduces this deadlock. I still assume it's a threading issue with import, because the auth code is waiting for an event that never fires. This may be a bug in paramiko, or python, but the good news is that you shouldn't ever see it if you do things correctly ;)

This is a good example of why you always want to minimize side effects, and why functional programming techniques are becoming more prevalent.

JimB answered Sep 18 '22 20:09