
Complete a multipart_upload with boto3?

Tried this:

import boto3
from boto3.s3.transfer import TransferConfig, S3Transfer
path = "/temp/"
fileName = "bigFile.gz" # this happens to be a 5.9 Gig file
region = "us-east-1"  # use your bucket's region
client = boto3.client('s3', region)
config = TransferConfig(
    multipart_threshold=4*1024, # number of bytes
    max_concurrency=10,
    num_download_attempts=10,
)
transfer = S3Transfer(client, config)
transfer.upload_file(path+fileName, 'bucket', 'key')

Result: 5.9 GB file on S3. Doesn't seem to contain multiple parts.

I found this example, but part is not defined (the snippet only assigns part1).

import boto3

bucket = 'bucket'
path = "/temp/"
fileName = "bigFile.gz"
key = 'key'

s3 = boto3.client('s3')

# Initiate the multipart upload and send the part(s)
mpu = s3.create_multipart_upload(Bucket=bucket, Key=key)
with open(path + fileName, 'rb') as data:
    part1 = s3.upload_part(Bucket=bucket,
                           Key=key,
                           PartNumber=1,
                           UploadId=mpu['UploadId'],
                           Body=data)

# Next, we need to gather information about each part to complete
# the upload. Needed are the part number and ETag.
part_info = {
    'Parts': [
        {
            'PartNumber': 1,
            'ETag': part['ETag']
        }
    ]
}

# Now the upload works!
s3.complete_multipart_upload(Bucket=bucket,
                             Key=key,
                             UploadId=mpu['UploadId'],
                             MultipartUpload=part_info)

Question: Does anyone know how to use multipart upload with boto3?

asked Dec 16 '15 by blehman

2 Answers

Your code was already correct. Indeed, a minimal example of a multipart upload just looks like this:

import boto3
s3 = boto3.client('s3')
s3.upload_file('my_big_local_file.txt', 'some_bucket', 'some_key')

You don't need to explicitly ask for a multipart upload, or use any of boto3's lower-level multipart functions. Just call upload_file, and boto3 will automatically use a multipart upload whenever the file size exceeds a threshold (8 MB by default).
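
If you want multipart behavior to kick in at a different size, you can pass a TransferConfig to upload_file via its Config parameter. A minimal sketch, with placeholder file, bucket, and key names:

import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client('s3')

# Lower the threshold so even smaller files are sent as multipart uploads.
config = TransferConfig(
    multipart_threshold=5 * 1024 * 1024,  # switch to multipart above 5 MB
    multipart_chunksize=5 * 1024 * 1024,  # 5 MB per part (S3's minimum part size)
    max_concurrency=10,                   # how many parts to upload in parallel
)

s3.upload_file('my_big_local_file.txt', 'some_bucket', 'some_key', Config=config)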

You seem to have been confused by the fact that the end result in S3 wasn't visibly made up of multiple parts:

Result: 5.9 GB file on S3. Doesn't seem to contain multiple parts.

... but this is the expected outcome. The whole point of the multipart upload API is to let you upload a single file over multiple HTTP requests and end up with a single object in S3.
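
That said, if you do want to drive the low-level multipart API yourself (as in the question's second snippet), the missing piece is collecting each part's PartNumber and ETag as you go. Here is a sketch of the full flow, assuming a placeholder file path and an illustrative 100 MB part size:

import boto3

s3 = boto3.client('s3')
bucket = 'bucket'  # placeholder
key = 'key'        # placeholder
part_size = 100 * 1024 * 1024  # every part except the last must be >= 5 MB

# 1. Start the multipart upload.
mpu = s3.create_multipart_upload(Bucket=bucket, Key=key)

# 2. Upload the file in chunks, recording each part's number and ETag.
parts = []
part_number = 1
with open('/temp/bigFile.gz', 'rb') as data:
    while True:
        chunk = data.read(part_size)
        if not chunk:
            break
        part = s3.upload_part(Bucket=bucket, Key=key,
                              PartNumber=part_number,
                              UploadId=mpu['UploadId'],
                              Body=chunk)
        parts.append({'PartNumber': part_number, 'ETag': part['ETag']})
        part_number += 1

# 3. Complete the upload; S3 assembles the parts into a single object.
s3.complete_multipart_upload(Bucket=bucket, Key=key,
                             UploadId=mpu['UploadId'],
                             MultipartUpload={'Parts': parts})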

answered Sep 28 '22 by Mark Amery

I would advise you to use boto3.s3.transfer for this purpose. Here is an example:

import boto3
import boto3.s3.transfer


def upload_file(filename):
    session = boto3.Session()
    s3_client = session.client("s3")

    try:
        print("Uploading file: {}".format(filename))

        tc = boto3.s3.transfer.TransferConfig()
        t = boto3.s3.transfer.S3Transfer(client=s3_client, config=tc)

        t.upload_file(filename, "my-bucket-name", "name-in-s3.dat")

    except Exception as e:
        print("Error uploading: {}".format(e))

answered Sep 28 '22 by deadcode