I have an AWS Lambda function which queries an API and creates a dataframe. I want to write this dataframe to an S3 bucket as a CSV file, and I am using:
import pandas as pd
import s3fs
df.to_csv('s3.console.aws.amazon.com/s3/buckets/info/test.csv', index=False)
I am getting an error:
No such file or directory: 's3.console.aws.amazon.com/s3/buckets/info/test.csv'
But that directory exists, because I am reading files from there. What is the problem here?
I've been reading the existing files like this:
s3_client = boto3.client('s3')
s3_client.download_file('info', 'secrets.json', '/tmp/secrets.json')
How can I upload the whole dataframe to an S3 bucket?
To upload folders and files to an S3 bucket: sign in to the AWS Management Console and open the Amazon S3 console at https://console.aws.amazon.com/s3/. In the Buckets list, choose the name of the bucket that you want to upload your folders or files to, then choose Upload.
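If the write has to happen inside the Lambda function itself (where the console isn't available), a common pattern is to write the CSV to Lambda's writable /tmp directory and upload it with the same boto3 client you already use for downloads. A minimal sketch, assuming the bucket 'info' and key 'test.csv' from the question:

import boto3
import pandas as pd

# Placeholder dataframe standing in for the one built from the API response
df = pd.DataFrame({'col1': [1, 2], 'col2': ['a', 'b']})

# /tmp is the writable path inside a Lambda execution environment
df.to_csv('/tmp/test.csv', index=False)

# Upload the local file to the 'info' bucket under the key 'test.csv'
s3_client = boto3.client('s3')
s3_client.upload_file('/tmp/test.csv', 'info', 'test.csv')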
You can also use the boto3 package to store data in S3:
from io import StringIO # python3 (or BytesIO for python2)
import boto3
bucket = 'info' # already created on S3
csv_buffer = StringIO()
df.to_csv(csv_buffer)
s3_resource = boto3.resource('s3')
s3_resource.Object(bucket, 'df.csv').put(Body=csv_buffer.getvalue())
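If you prefer the lower-level client to the resource interface, the same in-memory buffer can be handed to upload_fileobj, which expects a bytes-like file object, so the CSV text is encoded first. A sketch reusing the names from the snippet above:

from io import BytesIO, StringIO
import boto3
import pandas as pd

bucket = 'info'  # already created on S3
df = pd.DataFrame({'col1': [1, 2]})  # placeholder for your dataframe

csv_buffer = StringIO()
df.to_csv(csv_buffer)

# upload_fileobj needs a binary file-like object, so encode the CSV text
s3_client = boto3.client('s3')
s3_client.upload_fileobj(BytesIO(csv_buffer.getvalue().encode('utf-8')), bucket, 'df.csv')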
This:
"s3.console.aws.amazon.com/s3/buckets/info/test.csv"
is not an S3 URI; that is the console URL. You need to pass an S3 URI to save to S3. Moreover, you do not need to import s3fs (you only need it installed).
Just try:
import pandas as pd
df = pd.DataFrame()
# df.to_csv("s3://<bucket_name>/<obj_key>")
# In your case
df.to_csv("s3://info/test.csv")
NOTE: You need to create the bucket on AWS S3 first.
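For this to work inside Lambda, s3fs (and its dependencies) must be included in the deployment package or a layer, since pandas only discovers it at runtime. You can verify the upload by reading the object back through the same URI; a minimal check, assuming the bucket 'info' from the question:

import pandas as pd

# Read the object back through the same s3:// URI to confirm the write succeeded
df_check = pd.read_csv("s3://info/test.csv")
print(df_check.head())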