
Opening a remote file using Paramiko in Python is slow

Tags: python, file

I am using Paramiko to open a remote file over SFTP in Python. I read the file line by line from the file object Paramiko returns and process each line. This is really slow compared to using Python's built-in 'open'. Below is the code I use to get the file object.

Using Paramiko (about 2 times slower) -

import paramiko

client = paramiko.SSHClient()
client.load_system_host_keys()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect(myHost, myPort, myUser, myPassword)
sftp = client.open_sftp()
# Each read on this object goes over the network to the SFTP server
fileObject = sftp.file(fullFilePath, 'rb')

Using the built-in open -

# open is a Python built-in, so no import is needed
fileObject = open(fullFilePath, 'rb')

Am I missing anything? Is there a way to make reading from the Paramiko file object as fast as reading from the built-in file object?

Thanks!!

Rinks asked Sep 27 '11 02:09



2 Answers

Your problem is most likely caused by the file being a remote object. You've opened it on the server and are requesting one line at a time. Because the file is not local, each request takes much longer than it would if the file were sitting on your hard drive. The best alternative is probably to copy the file down to a local location first, using Paramiko's SFTP get.

Once you've done that, you can open the local copy using the built-in open.
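The download-first approach described above could look roughly like the sketch below. It assumes sftp is a paramiko.SFTPClient (for example, the result of client.open_sftp()). The function name process_remote_file and the handle_line callback are hypothetical names introduced only for illustration.

```python
import os
import tempfile


def process_remote_file(sftp, remote_path, handle_line):
    """Download remote_path with sftp.get, then read the local copy line by line.

    sftp is assumed to be a paramiko.SFTPClient; handle_line is a
    hypothetical callback that receives each line as bytes.
    Returns the number of lines processed.
    """
    # Reserve a temporary local path to receive the download.
    fd, local_path = tempfile.mkstemp()
    os.close(fd)
    try:
        # One bulk transfer instead of a network round trip per line.
        sftp.get(remote_path, local_path)
        count = 0
        with open(local_path, 'rb') as local_file:
            for line in local_file:
                handle_line(line)
                count += 1
        return count
    finally:
        # Remove the local copy once processing is done.
        os.remove(local_path)
```

The temporary file is deleted in the finally block, so the local copy only exists for the duration of the processing.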

John Lyon answered Nov 04 '22 04:11


I was having the same issue and could not afford to copy the file locally for security reasons. I solved it by combining prefetching with io.BytesIO:

import io

def fetch_file_as_bytesIO(sftp, path):
    """
    Retrieve the file at the given path through the SFTP client, using prefetching.
    :param sftp: the SFTP client (paramiko.SFTPClient)
    :param path: path of the file to retrieve
    :return: io.BytesIO with the file content
    """
    with sftp.file(path, mode='rb') as file:
        file_size = file.stat().st_size
        # Ask Paramiko to start requesting the whole file in the background
        file.prefetch(file_size)
        file.set_pipelined()
        # Read everything into memory and wrap it in a seekable buffer
        return io.BytesIO(file.read(file_size))
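Since the original question reads the file line by line, a variation on the same idea is to pull the whole file into memory once and then iterate over the buffer. This is only a sketch: it assumes the file fits in memory, that sftp is a paramiko.SFTPClient, and iter_lines_in_memory is a hypothetical name.

```python
import io


def iter_lines_in_memory(sftp, path):
    """Yield the lines (as bytes) of a remote file after one bulk read.

    Assumes sftp is a paramiko.SFTPClient and the file fits in memory.
    """
    with sftp.file(path, mode='rb') as remote_file:
        # Start fetching the file in the background before reading it.
        remote_file.prefetch()
        buffer = io.BytesIO(remote_file.read())
    # Iterating over a BytesIO yields one line at a time, with no network I/O.
    for line in buffer:
        yield line
```

Line iteration then happens entirely on the in-memory buffer, so the per-line network requests are avoided without writing the file to disk.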
arocketman answered Nov 04 '22 04:11