
Paramiko channel gets stuck when reading large output

I have code that executes a command on a remote Linux machine and reads the output using Paramiko. The code looks like this:

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect(IPAddress, username=user['username'], password=user['password'])


chan = ssh.get_transport().open_session()

chan.settimeout(10800)

try:
    # Execute the command
    chan.exec_command(cmd)

    contents = StringIO.StringIO()

    data = chan.recv(1024)

    # Capturing data from chan buffer.
    while data:
        contents.write(data)
        data = chan.recv(1024)

except socket.timeout:
    raise socket.timeout


output = contents.getvalue()

return output,chan.recv_stderr(600),chan.recv_exit_status()

The above code works for small outputs, but it gets stuck for larger outputs.

Is there a buffer-related issue here?

vipulb asked Feb 01 '13 10:02


5 Answers

TL;DR: Call stdout.readlines() before stderr.readlines() if using ssh.exec_command()

If you use @Spencer Rathbun's answer:

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect(IPAddress, username=user['username'], password=user['password'])

stdin, stdout, stderr = ssh.exec_command(cmd)

You might want to be aware of the limitations that can arise from having large outputs.

Experimentally, stdin, stdout, stderr = ssh.exec_command(cmd) will not be able to write the full output immediately to stdout and stderr. More specifically, a buffer appears to hold 2^21 (2,097,152) bytes before filling up, which matches Paramiko's default channel window size. If either buffer is full, the remote command blocks on writing to it and stays blocked until enough of that buffer has been read. This means that if your stdout is too large and you read stderr first, you'll hang on reading stderr, because neither stream reaches EOF until the command has written its full output.

The easy way around this is the one Spencer uses - get all the normal output via stdout.readlines() before trying to read stderr. This will only fail if you have more than 2^21 characters in stderr, which is an acceptable limitation in my use case.
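
As a minimal sketch of that ordering, assuming Python 3 and an already connected paramiko.SSHClient passed in as client (the helper name read_stdout_first is hypothetical, not part of Paramiko):

```python
def read_stdout_first(client, cmd):
    """Run cmd and drain stdout completely before touching stderr."""
    # exec_command returns three file-like objects: stdin, stdout, stderr.
    stdin, stdout, stderr = client.exec_command(cmd)

    # Reading stdout to EOF first keeps its window from filling up while
    # we are blocked somewhere else.
    out = stdout.read()

    # Assumption: stderr stayed below the window size, otherwise the
    # read above could still hang.
    err = stderr.read()
    return out, err
```

With a real client both values come back as bytes, so decode them (for example with out.decode()) if you need text.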

I'm mainly posting this because I'm dumb and spent far, far too long trying to figure out how I broke my code, when the answer was that I was reading from stderr before stdout and my stdout was too big to fit in the buffer.

jeremysprofile answered Sep 22 '22 07:09


I am posting the final code, which worked with input from Bruce Wayne :)

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect(IPAddress, username=user['username'], password=user['password'])

chan = ssh.get_transport().open_session()
chan.settimeout(10800)

try:
    # Execute the given command
    chan.exec_command(cmd)

    # To capture Data. Need to read the entire buffer to capture output
    contents = StringIO.StringIO()
    error = StringIO.StringIO()

    while not chan.exit_status_ready():
        if chan.recv_ready():
            data = chan.recv(1024)
            #print "Indside stdout"
            while data:
                contents.write(data)
                data = chan.recv(1024)

        if chan.recv_stderr_ready():            
            error_buff = chan.recv_stderr(1024)
            while error_buff:
                error.write(error_buff)
                error_buff = chan.recv_stderr(1024)

    exit_status = chan.recv_exit_status()

except socket.timeout:
    raise socket.timeout

output = contents.getvalue()
error_value = error.getvalue()

return output, error_value, exit_status
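
A possible variation on the loop above, sketched for Python 3: it reads at most one chunk per readiness check, so neither stream can starve the other, and it keeps draining after the exit status arrives. The helper name drain_channel and its chunk and poll_interval parameters are assumptions for illustration; only the channel methods it calls (recv_ready, recv, recv_stderr_ready, recv_stderr, exit_status_ready, recv_exit_status) come from Paramiko, and I have not tested it against a live server.

```python
import io
import time


def drain_channel(chan, chunk=4096, poll_interval=0.05):
    """Collect stdout, stderr and the exit status from a Paramiko-style channel."""
    out, err = io.BytesIO(), io.BytesIO()
    while True:
        # Sample the exit state before reading, so anything already
        # buffered when the command finished is still picked up below.
        finished = chan.exit_status_ready()
        progressed = False

        if chan.recv_ready():
            out.write(chan.recv(chunk))
            progressed = True
        if chan.recv_stderr_ready():
            err.write(chan.recv_stderr(chunk))
            progressed = True

        if finished and not progressed:
            break  # command has exited and both buffers are empty
        if not progressed:
            time.sleep(poll_interval)  # avoid a busy loop while waiting

    return out.getvalue(), err.getvalue(), chan.recv_exit_status()
```

The reason for reading only when recv_ready() or recv_stderr_ready() is true is that recv() blocks until data arrives or the channel closes, so an inner "while data" loop on one stream can sit waiting while the other stream fills up.
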
vipulb answered Oct 29 '22 19:10


It's easier if you use the high-level representation of an open SSH session. Since you already use SSHClient to open your channel, you can just run your command from there and avoid the extra work.

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect(IPAddress, username=user['username'], password=user['password'])

stdin, stdout, stderr = ssh.exec_command(cmd)
for line in stdout.readlines():
    print line
for line in stderr.readlines():
    print line

You will need to come back and read from these file handles again if you receive additional data afterwards.

Spencer Rathbun answered Oct 29 '22 21:10


I see no problem related to the stdout channel, but I'm not sure about the way you are handling stderr. Can you confirm it's not the stderr capturing that's causing the problem? I'll try out your code and let you know.

Update: when the command you execute writes a lot of messages to STDERR, your code freezes. I'm not sure why, but recv_stderr(600) might be the reason. So capture the error stream the same way you capture standard output, something like:

contents_err = StringIO.StringIO()

data_err = chan.recv_stderr(1024)
while data_err:
    contents_err.write(data_err)
    data_err = chan.recv_stderr(1024)

You may even first try changing recv_stderr(600) to recv_stderr(1024) or higher.

bruce_w answered Oct 29 '22 19:10


Actually, I think none of the above answers resolves the real problem:

If the remote program produces a large amount of stderr output first, then

stdout.readlines()
stderr.readlines()

would hang forever. Reversing the order,

stderr.readlines()
stdout.readlines()

would resolve this case, but it will fail if the remote program produces a large amount of stdout output first.

I don't have a solution yet...
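
One approach that might sidestep the ordering problem is to read both streams at the same time, so neither buffer can fill up while the other is being drained. The following is only a sketch for Python 3: read_both is a hypothetical helper, it assumes stdout and stderr are the file-like objects returned by ssh.exec_command(cmd), and it has not been verified against a live server.

```python
import threading


def read_both(stdout, stderr):
    """Read two file-like streams concurrently and return their contents."""
    results = {}

    def pump(name, stream):
        # Each thread blocks on its own stream until EOF.
        results[name] = stream.read()

    threads = [
        threading.Thread(target=pump, args=("stdout", stdout)),
        threading.Thread(target=pump, args=("stderr", stderr)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait until both streams have reached EOF

    return results["stdout"], results["stderr"]
```

If you do not need the two streams separated, Paramiko's Channel.set_combine_stderr(True) is another option to consider: it merges stderr into stdout, leaving a single stream to read.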

fubupc answered Oct 29 '22 19:10