I am trying to read a netCDF file remotely.
I used the Paramiko package to read my file, like this:
import paramiko
from netCDF4 import Dataset
client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect(hostname='hostname', username='usrname', password='mypassword')
sftp_client = client.open_sftp()
ncfile = sftp_client.open('mynetCDFfile')
b_ncfile = ncfile.read() # ****
nc = Dataset('test.nc', memory=b_ncfile)
But ncfile.read() runs VERY SLOWLY.
So my question is: is there an alternative way to read a netCDF file remotely, or is there a way to speed up paramiko.sftp_file.SFTPFile.read()?
Calling SFTPFile.prefetch should increase the read speed:
ncfile = sftp_client.open('mynetCDFfile')
ncfile.prefetch()
b_ncfile = ncfile.read()
Another option is enabling read buffering, using the bufsize parameter of SFTPClient.open:
ncfile = sftp_client.open('mynetCDFfile', bufsize=32768)
b_ncfile = ncfile.read()
(32768 is the value of SFTPFile.MAX_REQUEST_SIZE)
Similarly for writes/uploads, see:
Writing to a file on SFTP server opened using pysftp "open" method is slow.
Yet another option is to explicitly specify the amount of data to read (it makes BufferedFile.read take a more efficient code path):
ncfile = sftp_client.open('mynetCDFfile')
b_ncfile = ncfile.read(ncfile.stat().st_size)
If none of that works, you can download the whole file to memory instead:
Use pdfplumber and Paramiko to read a PDF file from an SFTP server
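As a minimal sketch of the "download the whole file to memory" approach: Paramiko's SFTPClient.getfo copies a remote file into any writable file-like object, so an io.BytesIO can collect the bytes. The helper name download_to_memory is hypothetical (it is not part of Paramiko), and sftp_client is assumed to be the object returned by client.open_sftp() in the question.

```python
import io


def download_to_memory(sftp_client, remote_path):
    """Download a remote file completely into memory and return its bytes.

    `sftp_client` is assumed to be a paramiko.SFTPClient (or any object
    exposing a compatible getfo(remotepath, fl) method).
    This helper is illustrative and not part of Paramiko itself.
    """
    buffer = io.BytesIO()
    # getfo writes the remote file's contents into the file-like object.
    sftp_client.getfo(remote_path, buffer)
    return buffer.getvalue()


# Usage with the names from the question (requires a live connection):
# b_ncfile = download_to_memory(sftp_client, 'mynetCDFfile')
# nc = Dataset('test.nc', memory=b_ncfile)
```

The returned bytes can be passed to Dataset(..., memory=...) exactly like the result of ncfile.read() in the question. Whether this is faster than the options above depends on the server and network, so it is worth timing both.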
Obligatory warning: Do not use AutoAddPolicy this way. You are losing protection against MITM attacks by doing so. For a correct solution, see Paramiko "Unknown Server".