 

CSV file upload from buffer to S3

I am trying to upload content taken out of a model in Django as a CSV file. I don't want to save the file locally; I want to keep it in a buffer and upload it straight to S3. Currently, this code runs without errors and creates the object in S3, however the uploaded file is empty.

import csv
import datetime
import io

import boto3

file_name = 'some_file.csv'
fields = [list_of_fields]
header = [header_fields]

buff = io.StringIO()
writer = csv.writer(buff, dialect='excel', delimiter=',')
writer.writerow(header)
for value in some_queryset:
    row = []
    for field in fields:
        # filling in the row
    writer.writerow(row)

# Upload to s3
client = boto3.client('s3')
bucket = 'some_bucket_name'
date_time = datetime.datetime.now()
date = date_time.date()
time = date_time.time()
dt = '{year}_{month}_{day}__{hour}_{minute}_{second}'.format(
    day=date.day,
    hour=time.hour,
    minute=time.minute,
    month=date.month,
    second=time.second,
    year=date.year,
)
key = 'some_name_{0}.csv'.format(dt)

client.upload_fileobj(buff, bucket, key)

If I check the buffer's contents, the data is definitely being written:

content = buff.getvalue()
content.encode('utf-8')
print("content: {0}".format(content)) # prints the csv content

EDIT: I am doing a similar thing with a zip file, created in a buffer:

with zipfile.ZipFile(buff, 'w') as archive:

I write to the archive (adding PDF files that I am generating), and once I am done I execute buff.seek(0), which seems to be necessary. If I do the same thing above, it errors out with: Unicode-objects must be encoded before hashing
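For reference, a minimal sketch of that zip-in-buffer pattern, assuming the buffer is an io.BytesIO (the archive entry name, key and PDF content are placeholders; the real code generates the PDFs):

import io
import zipfile

import boto3

buff = io.BytesIO()  # binary buffer, so both ZipFile and upload_fileobj work with bytes
with zipfile.ZipFile(buff, 'w') as archive:
    # placeholder for the generated PDF files
    archive.writestr('report.pdf', b'%PDF-1.4 placeholder')

buff.seek(0)  # rewind before uploading, otherwise nothing is read
boto3.client('s3').upload_fileobj(buff, 'some_bucket_name', 'some_archive.zip')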

asked Aug 15 '17 by camelBack


1 Answer

Okay, disregard my earlier answer; I found the actual problem.

According to the boto3 documentation for the upload_fileobj function, the first parameter (Fileobj) needs to implement a read() method that returns bytes:

Fileobj (a file-like object) -- A file-like object to upload. At a minimum, it must implement the read method, and must return bytes.

The read() function on a _io.StringIO object returns a string, not bytes. I would suggest swapping the StringIO object for a BytesIO object and adding the necessary encoding and decoding.

Here is a minimal working example. It is not the most efficient solution: the basic idea is to copy the contents over to a second BytesIO object (a copy-free variation is sketched after the example).

import io
import boto3
import csv

buff = io.StringIO()

writer = csv.writer(buff, dialect='excel', delimiter=',')
writer.writerow(["a", "b", "c"])

buff2 = io.BytesIO(buff.getvalue().encode())  # copy the CSV text into a bytes buffer

bucket = 'changeme'
key = 'blah.csv'

client = boto3.client('s3')
client.upload_fileobj(buff2, bucket, key)
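
If copying the whole buffer is a concern, an alternative sketch is to write through an io.TextIOWrapper around a BytesIO, so csv.writer sees the text stream it expects while the encoded bytes accumulate directly in the binary buffer (the bucket and key are the same placeholders as above):

import io
import csv

import boto3

raw = io.BytesIO()
# TextIOWrapper gives csv.writer a text interface; the UTF-8 encoded
# bytes land directly in `raw`.
text = io.TextIOWrapper(raw, encoding='utf-8', newline='')
writer = csv.writer(text, dialect='excel', delimiter=',')
writer.writerow(["a", "b", "c"])

text.flush()  # push buffered text down into the BytesIO
raw.seek(0)   # rewind so upload_fileobj reads from the start

client = boto3.client('s3')
client.upload_fileobj(raw, 'changeme', 'blah.csv')

Keep the wrapper alive (or call text.detach()) until the upload has finished; closing the TextIOWrapper also closes the underlying BytesIO.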
answered Oct 10 '22 by Thomite