
Python Write Temp File to S3

I am trying to write a dataframe to a temp file and then upload that temp file to an S3 bucket. When I run my code, nothing seems to happen. Any help would be greatly appreciated. The following is my code:

import csv
import pandas as pd
import boto3
import tempfile
import os 

s3 = boto3.client('s3', aws_access_key_id = access_key, aws_secret_access_key = secret_key, region_name = region)

temp = tempfile.TemporaryFile()
largedf.to_csv(temp, sep = '|')
s3.put_object(temp, Bucket = '[BUCKET NAME]', Key = 'test.txt')
temp.close()
jumpman23 asked Dec 19 '22 04:12



1 Answer

The file handle you pass to s3.put_object is positioned at the end of the file once the write has finished, so when you .read from it, it returns an empty string.

>>> import tempfile
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(np.random.randint(10,50, (5,5)))
>>> temp = tempfile.TemporaryFile(mode='w+')
>>> df.to_csv(temp)
>>> temp.read()
''

A quick fix is to .seek back to the beginning...

>>> temp.seek(0)
0
>>> print(temp.read())
,0,1,2,3,4
0,11,42,40,45,11
1,36,18,45,24,25
2,28,20,12,33,44
3,45,39,14,16,20
4,40,16,22,30,37
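
Applied to the upload in the question, the rewind could look roughly like the sketch below. It is not a drop-in fix: the function name upload_csv_via_tempfile is made up for illustration, s3_client is assumed to be a boto3 S3 client created as in the question, and the data is passed through the Body keyword because boto3 client methods accept keyword arguments only.

```python
import tempfile


def upload_csv_via_tempfile(s3_client, df, bucket, key, sep="|"):
    """Write df as CSV to a temporary file, rewind it, then upload it.

    Hypothetical helper: s3_client is assumed to be a boto3 S3 client and
    df a pandas DataFrame (anything with a compatible to_csv works).
    """
    # Binary mode, so that put_object receives a bytes-producing file object.
    with tempfile.TemporaryFile(mode="w+b") as temp:
        # to_csv() without a path returns the CSV text as a str.
        temp.write(df.to_csv(sep=sep).encode("utf-8"))
        # Rewind: after the write, the file position is at the end.
        temp.seek(0)
        # The payload goes in the Body keyword argument.
        return s3_client.put_object(Body=temp, Bucket=bucket, Key=key)
```

With the names from the question, the call would be something like upload_csv_via_tempfile(s3, largedf, '[BUCKET NAME]', 'test.txt').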

Note that writing to disk is not really necessary here. You could keep everything in memory using a buffer, something like:

from io import StringIO # on python 2, use from cStringIO import StringIO
buffer = StringIO()

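# Write the CSV into the in-memory buffer instead of a file on disk
df.to_csv(buffer)
buffer.seek(0)  # rewind, otherwise .read() returns an empty string
# boto3 client methods take keyword arguments only, so the data goes in Body
s3.put_object(Body = buffer.read().encode('utf-8'), Bucket = '[BUCKET NAME]', Key = 'test.txt')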
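
If the dataframe is large, a variation on the in-memory idea is to hand a binary buffer to upload_fileobj, which is another boto3 S3 client method that takes a file-like object opened in binary mode. This is only a sketch: the function name upload_csv_in_memory is hypothetical, and s3_client is assumed to be a boto3 S3 client like the one in the question.

```python
from io import BytesIO


def upload_csv_in_memory(s3_client, df, bucket, key, sep=","):
    """Upload df as CSV without touching the disk.

    Hypothetical helper: s3_client is assumed to be a boto3 S3 client and
    df a pandas DataFrame.
    """
    # to_csv() without a path returns a str; encode it because
    # upload_fileobj expects a binary file-like object.
    buffer = BytesIO(df.to_csv(sep=sep).encode("utf-8"))
    # A BytesIO created from initial bytes starts at position 0,
    # so no explicit seek(0) is needed before the upload.
    s3_client.upload_fileobj(buffer, bucket, key)
```

Building the BytesIO from the finished bytes sidesteps the rewind problem, because nothing is written through the handle before it is read.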
juanpa.arrivillaga answered Dec 20 '22 20:12
