I'm looking to transfer the contents of a folder from an FTP server to an S3 bucket without writing to disk. Currently, S3 receives all of the file names from the folder, but none of the actual data. Each file in the folder is only a few bytes, so I'm not sure why the whole file isn't being uploaded.
from ftplib import FTP
import io
import boto3

s3 = boto3.resource('s3')
ftp = FTP('ftp.ncbi.nlm.nih.gov')
ftp.login()
ftp.cwd('pubchem/RDF/descriptor/compound')
address = 'ftp.ncbi.nlm.nih.gov/pubchem/RDF/descriptor/compound/'
filelist = ftp.nlst()
for x in range(0, len(filelist)-1):
    myfile = io.BytesIO()
    filename = 'RETR ' + filelist[x]
    resp = ftp.retrbinary(filename, myfile.write)
    myfile.seek(0)
    path = address + filelist[x]
    # putting file on s3
    s3.Object(s3bucketname, path).put(Body = resp)
ftp.quit()
Is there any way to make sure the whole file is uploaded?
We can stream data from an FTP server to S3 using Python. The data is never written to the /tmp directory in AWS Lambda; it is streamed directly from the FTP server to the S3 bucket.
from ftplib import FTP
import s3fs

def lambda_handler(event, context):
    file_name = "test.txt"  # file name on the FTP server
    s3 = s3fs.S3FileSystem(anon=False)
    ftp_path = "<ftp_path>"
    s3_path = "s3-dev"  # S3 bucket name
    with FTP("<ftp_server>") as ftp:
        ftp.login()
        ftp.cwd(ftp_path)
        # open the S3 object in a context manager so it is closed
        # (and the upload flushed) when the transfer finishes
        with s3.open("{}/{}".format(s3_path, file_name), 'wb') as s3_file:
            ftp.retrbinary('RETR ' + file_name, s3_file.write)
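As to why the original code uploads empty objects: `ftp.retrbinary` returns the FTP status line (e.g. `'226 Transfer complete'`), not the file data; the data is delivered to the callback, which here writes into the `BytesIO` buffer. Passing `Body=resp` therefore uploads the status string instead of the buffer contents. A minimal sketch of the corrected buffering pattern, using a stand-in `fake_retrbinary` for illustration (an assumption standing in for the real `ftplib.FTP.retrbinary`, which has the same callback-and-status contract):

```python
import io

def retrieve_to_buffer(retr_command, retrbinary):
    """Stream a file into an in-memory buffer via a retrbinary-style call.

    retrbinary behaves like ftplib.FTP.retrbinary: it invokes the
    callback with successive data chunks and returns a status string
    (e.g. '226 Transfer complete') -- NOT the file data.
    """
    buf = io.BytesIO()
    status = retrbinary(retr_command, buf.write)  # status is text, not bytes
    buf.seek(0)  # rewind so the buffer can be read/uploaded from the start
    return buf

# Stand-in for ftplib's retrbinary (assumption: chunked delivery)
def fake_retrbinary(cmd, callback):
    for chunk in (b"hello ", b"world"):
        callback(chunk)
    return "226 Transfer complete"

buf = retrieve_to_buffer("RETR test.txt", fake_retrbinary)
print(buf.read())  # b'hello world'
```

With this pattern, the S3 upload in the original loop becomes `s3.Object(s3bucketname, path).put(Body=buf.getvalue())` (uploading the buffer, not the status string returned by `retrbinary`).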