Download file from S3 bucket to user's computer
I am working on a Python/Flask API for a React app. When the user clicks the Download button on the Front-End, I want to download the appropriate file to their machine.
import boto3
s3 = boto3.resource('s3')
s3.Bucket('mybucket').download_file('hello.txt', '/tmp/hello.txt')
I am currently using some code that finds the path of the Downloads folder and then plugs that path into download_file() as the second parameter, along with the key of the file in the bucket that they are trying to download.
This worked locally and the tests ran fine, but I ran into a problem once it was deployed: the code finds the Downloads path of the SERVER and downloads the file there.
What is the best way to approach this? I have researched and cannot find a good solution for downloading a file from the S3 bucket to the user's Downloads folder. Any help/advice is greatly appreciated.
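For illustration, the current approach is roughly equivalent to the sketch below (the Downloads-folder lookup here is simplified and hypothetical, but it shows why the path resolves on whatever machine runs the code, i.e. the server once deployed):

import os
import boto3

s3 = boto3.resource('s3')

def download_to_downloads_folder(key):
    # expanduser('~') resolves the home directory of the machine running this
    # code -- locally that's my machine, but after deployment it's the server.
    downloads_dir = os.path.join(os.path.expanduser('~'), 'Downloads')
    s3.Bucket('mybucket').download_file(key, os.path.join(downloads_dir, key))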
You can use cp to copy the files from an s3 bucket to your local system. Use the following command: $ aws s3 cp s3://bucket/folder/file.txt .
To download an entire bucket to your local file system, use the AWS CLI sync command, passing it the S3 bucket as a source and a directory on your file system as a destination, e.g. aws s3 sync s3://YOUR_BUCKET . (where the trailing . is the destination directory). The sync command recursively copies the contents of the source to the destination.
You can download a folder from S3 in two ways: one is from the web UI, and the other is with the CLI.
aws s3 sync s3://mybucket . will download all the objects in mybucket to the current directory. This will download all of your files using a one-way sync. It will not delete any existing files in your current directory unless you specify --delete, and it won't change or delete any files on S3.
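If you'd rather do the same thing from Python instead of the CLI, a rough boto3 sketch (assuming a bucket named mybucket, the current directory as the destination, and credentials configured in the environment) could look like this:

import os
import boto3

s3 = boto3.client('s3')
paginator = s3.get_paginator('list_objects_v2')

# Walk every object in the bucket and download it, recreating the key's
# "folder" structure under the current directory.
for page in paginator.paginate(Bucket='mybucket'):
    for obj in page.get('Contents', []):
        key = obj['Key']
        if key.endswith('/'):
            continue  # skip zero-byte "folder" placeholder keys
        local_path = os.path.join('.', key)
        os.makedirs(os.path.dirname(local_path) or '.', exist_ok=True)
        s3.download_file('mybucket', key, local_path)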
You should not need to save the file to the server. You can just download the file into memory and then build a Response object containing the file.
from flask import Flask, Response
from boto3 import client

app = Flask(__name__)

def get_client():
    return client(
        's3',
        'us-east-1',
        aws_access_key_id='id',
        aws_secret_access_key='key'
    )

@app.route('/blah', methods=['GET'])
def index():
    s3 = get_client()
    file = s3.get_object(Bucket='blah-test1', Key='blah.txt')
    return Response(
        file['Body'].read(),
        mimetype='text/plain',
        headers={"Content-Disposition": "attachment;filename=test.txt"}
    )

app.run(debug=True, port=8800)
This is OK for small files; there won't be any meaningful wait time for the user. However, with larger files, this will affect UX: the file has to be completely downloaded to the server and then downloaded to the user. To fix this, use the Range keyword argument of the get_object method:
from flask import Flask, Response
from boto3 import client

app = Flask(__name__)

def get_client():
    return client(
        's3',
        'us-east-1',
        aws_access_key_id='id',
        aws_secret_access_key='key'
    )

def get_total_bytes(s3):
    result = s3.list_objects(Bucket='blah-test1')
    for item in result['Contents']:
        if item['Key'] == 'blah.txt':
            return item['Size']

def get_object(s3, total_bytes):
    if total_bytes > 1000000:
        return get_object_range(s3, total_bytes)
    return s3.get_object(Bucket='blah-test1', Key='blah.txt')['Body'].read()

def get_object_range(s3, total_bytes):
    offset = 0
    while total_bytes > 0:
        end = offset + 999999 if total_bytes > 1000000 else ""
        total_bytes -= 1000000
        byte_range = 'bytes={offset}-{end}'.format(offset=offset, end=end)
        offset = end + 1 if not isinstance(end, str) else None
        yield s3.get_object(Bucket='blah-test1', Key='blah.txt', Range=byte_range)['Body'].read()

@app.route('/blah', methods=['GET'])
def index():
    s3 = get_client()
    total_bytes = get_total_bytes(s3)
    return Response(
        get_object(s3, total_bytes),
        mimetype='text/plain',
        headers={"Content-Disposition": "attachment;filename=test.txt"}
    )

app.run(debug=True, port=8800)
This will download the file in 1MB chunks and send them to the user as they are downloaded. Both of these have been tested with a 40MB .txt file.
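As a side note, a similar streaming effect can be had without issuing manual Range requests: botocore's StreamingBody exposes iter_chunks(), and Flask will stream a generator in a Response. A minimal sketch, reusing the bucket and key from the examples above, assuming credentials come from the environment (the /blah-stream route name is just for illustration):

from flask import Flask, Response
from boto3 import client

app = Flask(__name__)

@app.route('/blah-stream', methods=['GET'])
def stream():
    s3 = client('s3', 'us-east-1')
    obj = s3.get_object(Bucket='blah-test1', Key='blah.txt')
    # iter_chunks() reads the body lazily, so the whole file is never
    # buffered on the server at once.
    return Response(
        obj['Body'].iter_chunks(chunk_size=1024 * 1024),
        mimetype='text/plain',
        headers={"Content-Disposition": "attachment;filename=test.txt"}
    )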
A better way to solve this problem is to create a presigned URL. This gives you a temporary URL that is valid for a set amount of time. It also removes your Flask server as a proxy between the S3 bucket and the user, which reduces download time for the user.
import boto3

def get_attachment_url():
    bucket = 'BUCKET_NAME'
    key = 'FILE_KEY'
    client = boto3.client(
        's3',
        aws_access_key_id='YOUR_AWS_ACCESS_KEY',
        aws_secret_access_key='YOUR_AWS_SECRET_KEY'
    )
    return client.generate_presigned_url(
        'get_object',
        Params={'Bucket': bucket, 'Key': key},
        ExpiresIn=60
    )
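On the Flask side, a hypothetical route (not part of the original answer) could hand that URL to the React app as JSON; the click handler then simply navigates the browser to it, so the file goes straight from S3 to the user and never passes through your server:

from flask import Flask, jsonify

app = Flask(__name__)

@app.route('/download-url', methods=['GET'])
def download_url():
    # Reuses get_attachment_url() from above; the browser follows the
    # returned presigned URL directly to download the file from S3.
    return jsonify({'url': get_attachment_url()})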