I'm trying to write a pandas DataFrame as a pickle file to an S3 bucket in AWS. I know that I can write the DataFrame new_df
as a CSV to an S3 bucket as follows:
from io import StringIO

import boto3

bucket = 'mybucket'
key = 'path'

csv_buffer = StringIO()
s3_resource = boto3.resource('s3')
new_df.to_csv(csv_buffer, index=False)
s3_resource.Object(bucket, key).put(Body=csv_buffer.getvalue())
I've tried using the same code as above with to_pickle(), but with no success.
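Roughly what I tried (a sketch, with a toy frame standing in for my real data): reusing the CSV recipe with a text buffer. It fails with a TypeError, because pickle produces bytes while StringIO only accepts str.

from io import StringIO

import pandas as pd

new_df = pd.DataFrame({'a': [1, 2, 3]})  # stand-in for my actual DataFrame

pickle_buffer = StringIO()  # wrong buffer type for binary data
new_df.to_pickle(pickle_buffer)  # raises TypeError: pickle writes bytes, not str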
Further to your answer, you don't need to convert to CSV: the pickle.dumps method returns a bytes object. See here: https://docs.python.org/3/library/pickle.html
import pickle

import boto3

bucket = 'your_bucket_name'
key = 'your_pickle_filename.pkl'

pickle_byte_obj = pickle.dumps([var1, var2, ..., varn])

s3_resource = boto3.resource('s3')
s3_resource.Object(bucket, key).put(Body=pickle_byte_obj)
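A sketch of the reverse direction, assuming the same bucket and key as above: get() returns the stored object, and pickle.loads turns the body bytes back into the original value.

import pickle

import boto3

s3_resource = boto3.resource('s3')
# Body is a streaming response; read() yields the raw pickle bytes we put() above
body = s3_resource.Object(bucket, key).get()['Body'].read()
data = pickle.loads(body)  # the original [var1, var2, ..., varn] list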
I've found the solution: you need to use BytesIO for the buffer with pickle files instead of StringIO (which is for text formats like CSV).
import io

import boto3

# A binary buffer is required here, since pickle output is bytes, not text
pickle_buffer = io.BytesIO()
s3_resource = boto3.resource('s3')

# to_pickle accepts a file-like object in recent pandas versions
new_df.to_pickle(pickle_buffer)
s3_resource.Object(bucket, key).put(Body=pickle_buffer.getvalue())
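For completeness, a sketch of loading the DataFrame back out of S3, assuming the same bucket and key (pd.read_pickle accepts a file-like object in recent pandas versions):

import io

import boto3
import pandas as pd

s3_resource = boto3.resource('s3')
# Download the pickle bytes and hand them to pandas as a binary buffer
body = s3_resource.Object(bucket, key).get()['Body'].read()
df = pd.read_pickle(io.BytesIO(body))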