Boto3: Wait for S3 streaming upload to complete

I'm using S3.Client.upload_fileobj() with a BytesIO stream as input to upload a file to S3. My function should not return before the upload is finished, so I need a way to wait for it.

From the documentation there is no obvious way to wait for the transfer to finish, but there are some hints of what could work:

1. Use the Callback argument to wait until progress is at 100%. In JavaScript this would be trivial using callbacks or promises, but in Python I'm not so sure.
2. Use an S3.Waiter object that checks whether the object exists. But it does so by polling every 5s, which seems very inefficient. Also, I'm not sure it would wait until the object is complete.
3. There's a class S3.MultipartUpload with a .complete() method, but I doubt that does what I want.
4. Run a loop that checks whether the object is completely uploaded and, if not, sleeps for a bit. But how do I check whether the object is complete?

I've been googling but it seems nobody is asking the same question. Also, most results talking about related issues are using a different API (I believe upload_fileobj() is rather new).

EDIT: I found out about S3.Client.put_object, which also accepts a file-like object and blocks until the server has responded. But would that work in combination with streams? I'm not sure how Python multithreading works here. The stream originally comes from an S3.Client.download_fileobj() call, gets piped through a subprocess.Popen(), and is then supposed to be uploaded back to S3. Both the download and the subprocess run in parallel threads/processes as far as I can tell.

223

asked Feb 22 '17 04:02

cpury

1 Answer

The upload_file/upload_fileobj methods already take care of what you're looking for: they block until the object/file has finished uploading.

I don't suggest the 1st or 4th option. There's no need to use an S3 waiter either, as the upload_file/upload_fileobj methods return only after the upload is done.
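A minimal sketch of what that looks like in practice. The helper name upload_bytes, the ByteCounter class, and the bucket/key values are made up for illustration; only upload_fileobj() and its Callback argument come from boto3. The callback is optional and is only there to show progress reporting, not to wait on anything.

```python
import io
import threading


class ByteCounter:
    """Callback that adds up the byte counts boto3 reports during a transfer."""

    def __init__(self):
        self.total = 0
        self._lock = threading.Lock()  # callbacks may fire from worker threads

    def __call__(self, bytes_transferred):
        with self._lock:
            self.total += bytes_transferred


def upload_bytes(client, data, bucket, key):
    """Upload `data` and return the byte count reported through the callback.

    `client` is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    counter = ByteCounter()
    stream = io.BytesIO(data)
    # This call blocks: when it returns, the transfer has finished
    # (or an exception has been raised).
    client.upload_fileobj(stream, bucket, key, Callback=counter)
    return counter.total


def main():
    import boto3  # imported here so the helper itself has no hard dependency

    # "my-bucket" and "my-key" are placeholders.
    sent = upload_bytes(boto3.client("s3"), b"hello world", "my-bucket", "my-key")
    print(f"upload finished, {sent} bytes reported")
```

Anything placed after the upload_fileobj() line runs only once the upload has completed, so no waiter or polling loop is needed.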

Note that the upload_file/upload_fileobj methods automatically handle reading the input and splitting large files into multipart uploads that run in parallel, so there's no need to manage a multipart upload yourself, regardless of file size.
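For the subprocess part of the pipeline described in the question, a rough sketch might look like the following. It assumes the process's stdout can be handed straight to upload_fileobj(), which should work because the method only requires a binary file-like object with a read() method. The helper name upload_command_output and the example command are hypothetical, and feeding the process's stdin from download_fileobj() is left out.

```python
import subprocess
import sys


def upload_command_output(client, command, bucket, key):
    """Run `command`, upload its stdout, and return the process exit code.

    `client` is assumed to be a boto3 S3 client.
    """
    proc = subprocess.Popen(command, stdout=subprocess.PIPE)
    try:
        # proc.stdout is a binary, non-seekable stream. The call blocks
        # until the stream reaches EOF and the upload has finished.
        client.upload_fileobj(proc.stdout, bucket, key)
    finally:
        proc.stdout.close()
        returncode = proc.wait()
    return returncode


def main():
    import boto3  # imported here so the helper itself has no hard dependency

    # Placeholder command, bucket, and key.
    command = [sys.executable, "-c", "print('hello')"]
    code = upload_command_output(boto3.client("s3"), command, "my-bucket", "my-key")
    print(f"upload finished, process exited with {code}")
```

The upload completes even if the command exits with a non-zero code, so the caller should check the return value and decide whether the uploaded object is usable.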

118

answered Sep 23 '22 11:09

Venkatesh Wadawadagi
