How to download files from s3 given the file path using boto3 in python

This is pretty basic, but I am not able to download a file given its S3 path.

For example, I have this path: s3://name1/name2/file_name.txt

import boto3
locations = ['s3://name1/name2/file_name.txt']
s3_client = boto3.client('s3')
bucket = 'name1'
prefix = 'name2'

for file in locations:
    s3_client.download_file(bucket, 'file_name.txt', 'my_local_folder')

I get this error: botocore.exceptions.ClientError: An error occurred (404) when calling the HeadObject operation: Not Found

The file exists: I can download it with the AWS CLI using the S3 path s3://name1/name2/file_name.txt.

asked Apr 14 '18 07:04

Atihska

2 Answers

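The object key must include the prefix (name2/file_name.txt, not just file_name.txt), and the last argument must be a local file path rather than a folder. Build a list of keys, then modify your code as shown in the documentation: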

import os
import boto3
import botocore

# Object keys include the prefix; the bucket name is passed separately
files = ['name2/file_name.txt']

bucket = 'name1'

s3 = boto3.resource('s3')

for file in files:
    try:
        # Save each object under its base name in the current directory
        s3.Bucket(bucket).download_file(file, os.path.basename(file))
    except botocore.exceptions.ClientError as e:
        if e.response['Error']['Code'] == "404":
            print("The object does not exist.")
        else:
            raise
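
If you start from full s3:// paths, as in the question, one option is to split each path into a bucket and a key first. The following is a minimal sketch, not part of the original answer: the helper names split_s3_uri and download_s3_uri are made up for illustration, and the client is passed in so the parsing can be checked without AWS access. It assumes keys do not contain '?' or '#', which urlparse would treat as query or fragment separators.

```python
import os
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/some/key.txt' into ('bucket', 'some/key.txt')."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError("Not an S3 URI: {!r}".format(uri))
    # The path starts with '/', which is not part of the object key
    key = parsed.path.lstrip("/")
    if not key:
        raise ValueError("S3 URI has no object key: {!r}".format(uri))
    return parsed.netloc, key


def download_s3_uri(s3_client, uri, local_dir="."):
    """Download the object at `uri` into `local_dir` and return the local path.

    `s3_client` is expected to behave like boto3.client('s3').
    """
    bucket, key = split_s3_uri(uri)
    # download_file needs a file path as its last argument, not a folder
    local_path = os.path.join(local_dir, os.path.basename(key))
    s3_client.download_file(bucket, key, local_path)
    return local_path
```

With a real client, this would be called as download_s3_uri(boto3.client('s3'), 's3://name1/name2/file_name.txt').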

answered Nov 06 '22 06:11

Burhan Khalid

You may need to do this with some type of authentication. There are several methods, but creating a session is simple and fast:

from boto3.session import Session

bucket_name = 'your_bucket_name'
folder_prefix = 'your/path/to/download/files'
credentials = 'credentials.txt'

# The file holds one line: <access key id>:<secret access key>
with open(credentials, 'r', encoding='utf-8') as f:
    access_key, secret_key = f.readline().strip().split(':', 1)

session = Session(
    aws_access_key_id=access_key,
    aws_secret_access_key=secret_key
)

s3 = session.resource('s3')
bucket = s3.Bucket(bucket_name)

# Download every object under the prefix into /tmp
for s3_file in bucket.objects.filter(Prefix=folder_prefix):
    file_object = s3_file.key
    file_name = file_object.split('/')[-1]
    if not file_name:
        # Skip "folder" placeholder keys that end with '/'
        continue
    print('Downloading file {} ...'.format(file_object))
    bucket.download_file(file_object, '/tmp/{}'.format(file_name))

The credentials.txt file must contain a single line with the access key ID and the secret access key, separated by a colon, for example:

~$ cat credentials.txt
AKIAIO5FODNN7EXAMPLE:ABCDEF+c2L7yXeGvUyrPgYsDnWRRC1AYEXAMPLE

Don't forget to protect this file on your host: give read-only permission only to the user who runs this program. I hope it works for you; it works perfectly for me.
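
As a rough sketch of that last point, the permissions can also be tightened from Python on a POSIX system. The function name restrict_to_owner_read is made up for illustration; the result is the same as running chmod 400 on the file.

```python
import os
import stat


def restrict_to_owner_read(path):
    """Make `path` readable by its owner only and return the resulting mode bits."""
    # stat.S_IRUSR is 0o400: owner may read, nobody may write or execute
    os.chmod(path, stat.S_IRUSR)
    return stat.S_IMODE(os.stat(path).st_mode)
```

For the file used above, this would be called as restrict_to_owner_read('credentials.txt').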

answered Nov 06 '22 06:11

JavDomGom

Tags: python, amazon-s3, boto3
