Trouble Transferring data from FTP server to S3 via stream using Python

Question

I want to transfer the contents of a folder from an FTP server to an S3 bucket without writing to disk. Currently, S3 receives the names of all the files in the folder, but none of the actual data. Each file in the folder is only a few bytes. I'm not sure why the whole file isn't being uploaded.

from ftplib import FTP
import io 
import boto3


s3= boto3.resource('s3')

ftp = FTP('ftp.ncbi.nlm.nih.gov')
ftp.login()
ftp.cwd('pubchem/RDF/descriptor/compound')

address =  'ftp.ncbi.nlm.nih.gov/pubchem/RDF/descriptor/compound/'

filelist = ftp.nlst()

for x in range(0, len(filelist)-1):
    myfile = io.BytesIO()
    filename = 'RETR ' + filelist[x]
    resp = ftp.retrbinary(filename, myfile.write)
    myfile.seek(0)
    path = address + filelist[x]
    #putting file on s3
    s3.Object(s3bucketname, path).put(Body = resp)


ftp.quit()

Is there any way to make sure the whole file is uploaded?

vinod_vh · Accepted Answer

You can transfer the data from an FTP server to S3 as a stream using Python. When this runs in AWS Lambda, nothing is downloaded to the /tmp location; the data goes from the FTP server straight to the S3 bucket.

from ftplib import FTP
import s3fs

def lambda_handler(event, context):
    file_name = "test.txt"  # file name on the FTP server
    ftp_path = "<ftp_path>"
    s3_path = "s3-dev"  # S3 bucket name
    s3 = s3fs.S3FileSystem(anon=False)

    with FTP("<ftp_server>") as ftp:
        ftp.login()
        ftp.cwd(ftp_path)
        # Closing the S3 file when the with block exits completes the upload
        with s3.open("{}/{}".format(s3_path, file_name), 'wb') as s3_file:
            ftp.retrbinary('RETR ' + file_name, s3_file.write)
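
As for the boto3 code in the question, a likely cause of the tiny uploads is that `ftp.retrbinary` returns the server's status line (something like "226 Transfer complete"), not the file contents, so `Body = resp` uploads only that string while the real data sits unused in `myfile`. Separately, `range(0, len(filelist)-1)` skips the last file. Below is a minimal sketch of a fix, assuming the files are small enough to hold in memory. The function name `copy_ftp_files_to_s3` and its parameters are made up for illustration; `upload_fileobj` is the boto3 S3 client method.

```python
import io


def copy_ftp_files_to_s3(ftp, s3_client, bucket, key_prefix=""):
    """Copy every file in the FTP server's current directory to S3, in memory.

    `ftp` is a logged-in ftplib.FTP (or anything with nlst/retrbinary);
    `s3_client` is a boto3 S3 client (or anything with upload_fileobj).
    Returns the list of S3 keys that were written.
    """
    keys = []
    for name in ftp.nlst():  # iterate over every entry, including the last
        buffer = io.BytesIO()
        # retrbinary hands each chunk to the callback and returns only the
        # server's status line, so the file data ends up in `buffer`.
        ftp.retrbinary("RETR " + name, buffer.write)
        buffer.seek(0)  # rewind so the upload reads from the start
        key = key_prefix + name
        s3_client.upload_fileobj(buffer, bucket, key)
        keys.append(key)
    return keys
```

With real connections this could be called as `copy_ftp_files_to_s3(ftp, boto3.client('s3'), s3bucketname, address)` after `ftp.cwd(...)`. Note that `nlst()` may also return subdirectory names, which `RETR` would reject, so a real script may need to filter the listing.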

Tags: python, amazon-s3, ftp

Satchmo
