
Uploading files to S3 using Python

I have a list of file URLs that are download links, and I have written Python code to download the files to my computer. The problem is that there are about 500 files in the list, and Chrome becomes unresponsive after downloading about 50 of them. My actual goal is to upload all of the downloaded files to an S3 bucket. Is there a way to send the files to S3 directly? Here is what I have written so far:

import requests
from itertools import chain
import webbrowser

url = "<my_url>"
username = "<my_username>"
password = "<my_password>"
headers = {"Content-Type":"application/xml","Accept":"*/*"}

response = requests.get(url, auth=(username, password), headers = headers)
if response.status_code != 200:
    print('Status:', response.status_code, 'Headers:', response.headers, 'Error Response:', response.json())
    exit()

data = response.json()
values = list(chain.from_iterable(data.values()))
links = [lis['download_link'] for lis in values]
for item in links:
    webbrowser.open(item)
alapalak asked Jun 30 '26 17:06



2 Answers

It's quite simple using Python 3 and boto3 (the AWS SDK for Python), e.g.:

import boto3

s3 = boto3.client('s3')
with open('filename.txt', 'rb') as data:
    s3.upload_fileobj(data, 'bucketname', 'filenameintos3.txt')

For more information, you can read the boto3 documentation here: http://boto3.readthedocs.io/en/latest/guide/s3-example-creating-buckets.html
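The snippet above uploads a file that already exists on disk. Since the question asks about skipping the local download, here is a rough sketch of one way to stream each link straight into a bucket. It assumes the download links can be fetched with requests (possibly with the same basic auth as the API call) and that the last path segment of each URL is an acceptable object key. The names stream_to_s3 and key_from_url are made up for this illustration, and the client and HTTP function are passed in as arguments so the sketch does not depend on any particular setup.

```python
import posixpath
from urllib.parse import urlparse


def key_from_url(url):
    """Use the last path segment of the URL as the S3 object key (an assumption)."""
    name = posixpath.basename(urlparse(url).path)
    return name or "download"


def stream_to_s3(links, bucket, s3_client, http_get, auth=None):
    """Copy the body of each link into the bucket without saving it locally.

    s3_client is expected to behave like boto3.client('s3') and
    http_get like requests.get.
    """
    uploaded = []
    for link in links:
        # stream=True keeps requests from reading the whole body into memory
        with http_get(link, auth=auth, stream=True) as response:
            response.raise_for_status()
            key = key_from_url(link)
            # response.raw is a file-like object, which upload_fileobj accepts.
            # Note: raw is not decoded, so a gzip-encoded response is stored as-is.
            s3_client.upload_fileobj(response.raw, bucket, key)
            uploaded.append(key)
    return uploaded


# Possible usage with the real libraries (bucket name is a placeholder):
#   import boto3, requests
#   stream_to_s3(links, "bucketname", boto3.client("s3"), requests.get,
#                auth=(username, password))
```

With this approach nothing is opened in the browser, so the 500 links are processed one after another by the script itself.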

Enjoy

Paulo Victor answered Jul 03 '26 08:07



If you have the AWS CLI installed on your system, you can make use of the subprocess library. For example:

import subprocess

def copy_file_to_s3(source: str, target: str, bucket: str):
    # Shell out to the AWS CLI; check=True raises CalledProcessError if the copy fails
    subprocess.run(["aws", "s3", "cp", source, f"s3://{bucket}/{target}"], check=True)

You can use the same approach for all sorts of AWS CLI operations, such as downloading or listing files. This way there is no need to import boto3. I guess the CLI is not intended to be used that way, but in practice I find it quite convenient. You also get the status of the upload displayed in your console, for example:

Completed 3.5 GiB/3.5 GiB (242.8 MiB/s) with 1 file(s) remaining
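As a rough sketch of how the same subprocess approach might extend to downloading and listing, the helpers below build the argument lists for aws s3 cp and aws s3 ls and run them. All function names here are hypothetical, and the runner parameter exists only so the command can be swapped out when the AWS CLI is not installed.

```python
import subprocess


def s3_uri(bucket: str, key: str = "") -> str:
    """Build an s3:// URI from a bucket name and an optional key or prefix."""
    return f"s3://{bucket}/{key}"


def download_command(bucket: str, key: str, destination: str) -> list:
    """Arguments for `aws s3 cp` from the bucket to a local path."""
    return ["aws", "s3", "cp", s3_uri(bucket, key), destination]


def list_command(bucket: str, prefix: str = "") -> list:
    """Arguments for `aws s3 ls` under an optional prefix."""
    return ["aws", "s3", "ls", s3_uri(bucket, prefix)]


def run_aws(command: list, runner=subprocess.run) -> str:
    """Run a CLI command and return its stdout; raise if the exit code is non-zero."""
    # capture_output=True collects stdout/stderr instead of printing them,
    # so the progress line is not shown on the console in this variant.
    result = runner(command, capture_output=True, text=True, check=True)
    return result.stdout
```

For example, run_aws(list_command("bucketname")) would return the listing as text. If you want the progress line in the console instead, drop capture_output=True.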

To adapt the function to your needs, I recommend having a look at the subprocess reference as well as the AWS CLI reference.

Jojo answered Jul 03 '26 06:07



