I want to download files from a remote server using Paramiko with multithreading.
Two solutions came to mind, but I'm not sure which one is correct (or better).
Solution 1:
Assuming that SFTPClient.get is thread-safe (though I can't find any documentation saying so), a simple version would be:
from paramiko import SSHClient, AutoAddPolicy
from concurrent.futures import ThreadPoolExecutor
from typing import List

client = SSHClient()
client.set_missing_host_key_policy(AutoAddPolicy())
client.connect( ... )
sftp = client.open_sftp()

files_to_download: List[str] = ...

with ThreadPoolExecutor(10) as pool:
    # All threads share the single sftp instance
    pool.map(lambda fn: sftp.get(fn, fn), files_to_download)
Solution 2: Because of the open questions in Solution 1, here is my second solution, where each thread gets its own SFTPClient:
from paramiko import SSHClient, AutoAddPolicy
from concurrent.futures import ThreadPoolExecutor
from threading import Lock, local
from typing import List

client = SSHClient()
client.set_missing_host_key_policy(AutoAddPolicy())
client.connect( ... )

thread_local = local()
thread_lock = Lock()

files_to_download: List[str] = ...

def download(fn: str) -> None:
    """Thread-safe: each thread lazily opens its own SFTPClient."""
    if not hasattr(thread_local, 'sftp'):
        # Serialize open_sftp() calls on the shared SSHClient
        with thread_lock:
            thread_local.sftp = client.open_sftp()
    thread_local.sftp.get(fn, fn)

with ThreadPoolExecutor(10) as pool:
    pool.map(download, files_to_download)
Which solution is better?
Paramiko is not thread safe.
Using multiple threads over one connection might not give you the performance you hope for anyway. You would have to open a separate connection (SSHClient/SFTPClient) per thread.
A single shared connection gives better performance only in scenarios like transferring a large number of small files. For that, see Slow upload of many small files with SFTP.
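The per-thread-connection approach could be sketched as follows. This is a minimal sketch, not a definitive implementation: make_downloader and connect_sftp are hypothetical helpers, and the host and username passed to connect() are placeholders for your own. Each worker thread lazily opens its own SSHClient and SFTPClient, so nothing Paramiko-related is ever shared between threads and no lock is needed.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import local
from typing import Callable, List

thread_local = local()

def make_downloader(connect: Callable[[], object]) -> Callable[[str], None]:
    """Build a download() whose worker threads each own one connection.

    `connect` must return an object with a get(remote, local) method,
    e.g. a fresh paramiko SFTPClient backed by its own SSHClient.
    """
    def download(fn: str) -> None:
        if not hasattr(thread_local, "sftp"):
            # No lock needed: each thread writes only its own
            # thread-local slot, and the connection is never shared.
            thread_local.sftp = connect()
        thread_local.sftp.get(fn, fn)
    return download

def connect_sftp():
    # Hypothetical per-thread connection factory; adapt the host,
    # credentials, and host-key policy to your environment.
    from paramiko import SSHClient, AutoAddPolicy
    client = SSHClient()
    client.set_missing_host_key_policy(AutoAddPolicy())
    client.connect("example.com", username="user")
    return client.open_sftp()

# Usage (requires a reachable server):
#     files_to_download: List[str] = ...
#     with ThreadPoolExecutor(10) as pool:
#         # Consume the iterator so worker exceptions are re-raised here
#         list(pool.map(make_downloader(connect_sftp), files_to_download))
```

Note that consuming the pool.map iterator (e.g. with list()) matters: Executor.map evaluates lazily in places, and iterating it is what surfaces any exception raised inside a worker.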