
How to check if local file is same as S3 object without downloading it with boto3?

How can I check whether a local file is the same as a file stored in S3 without downloading it? I want to avoid downloading large files again and again. S3 objects have ETags, but they are difficult to compute if the file was uploaded in parts, and the solution from this question doesn't seem to work. Is there an easier way to avoid unnecessary downloads?

DikobrAz asked Jun 13 '17 21:06




2 Answers

I would just compare the last-modified times and download if they differ. You can also compare the sizes before downloading. Given a bucket, a key and a local file fname:

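import boto3
import os.path

def isModified(bucket, key, fname):
  # True if the object's LastModified differs from the local file's mtime.
  # Only meaningful if the local mtime was set to LastModified on download.
  s3 = boto3.resource('s3')
  obj = s3.Object(bucket, key)
  # timestamp() is portable and honors the UTC tzinfo; strftime('%s') is a
  # platform extension that ignores it.
  return int(obj.last_modified.timestamp()) != int(os.path.getmtime(fname))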
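
The size comparison mentioned above could be folded in roughly as follows. This is only a sketch: `is_same` and `needs_download` are hypothetical names, and the time check assumes the local file's mtime was set to the object's last-modified time when it was downloaded (for example with `os.utime`), since a freshly written file otherwise carries the download time.

```python
import os


def is_same(remote_size, remote_mtime, fname):
    """Return True if the local file matches the remote size and mtime.

    remote_size is in bytes; remote_mtime is seconds since the epoch.
    """
    if not os.path.exists(fname):
        return False
    st = os.stat(fname)
    # Size is the cheap check; a mismatch alone proves the files differ.
    if st.st_size != remote_size:
        return False
    # Compare whole seconds to sidestep sub-second differences.
    return int(st.st_mtime) == int(remote_mtime)


def needs_download(bucket, key, fname):
    """Fetch the object's metadata with a HEAD request and compare it."""
    import boto3  # imported here so is_same() can be used without boto3

    head = boto3.client('s3').head_object(Bucket=bucket, Key=key)
    return not is_same(head['ContentLength'],
                       head['LastModified'].timestamp(), fname)
```

Equal size and time do not prove equal content, so this trades certainty for speed.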
helloV answered Sep 29 '22 17:09


Can you use a small local database, e.g. a text file?

  • Download an S3 object once. Note its ETag.
  • Compute whatever signature you want.
  • Put the (ETag, signature) pair into the 'database'.

Next time, before you proceed with downloading, look up the ETag in the 'database'. If it's there, compute the signature of your existing file and compare it with the signature stored for that ETag. If they match, the remote file is the same as the one you have.

There's a possibility that the same file will be re-uploaded with different chunking, thus changing the ETag. Unless this is very probable, you can just ignore the false negative and re-download the file in that rare case.
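
One way this bookkeeping might look, as a sketch under assumptions: the 'database' is a JSON file mapping ETag to signature, the signature is SHA-256, and every function name here is hypothetical. Only `remote_etag` touches S3; it reads the ETag with a HEAD request, which does not transfer the object body.

```python
import hashlib
import json
import os


def file_signature(fname, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large files fit in memory."""
    digest = hashlib.sha256()
    with open(fname, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def load_db(db_path):
    """Load the ETag -> signature mapping, or an empty one if absent."""
    if not os.path.exists(db_path):
        return {}
    with open(db_path) as f:
        return json.load(f)


def record(db_path, etag, fname):
    """After a download, remember which signature this ETag produced."""
    db = load_db(db_path)
    db[etag] = file_signature(fname)
    with open(db_path, 'w') as f:
        json.dump(db, f)


def is_current(db_path, etag, fname):
    """True if the ETag is known and the local file still matches it."""
    known = load_db(db_path).get(etag)
    return (known is not None and os.path.exists(fname)
            and file_signature(fname) == known)


def remote_etag(bucket, key):
    """Fetch the object's current ETag with a HEAD request."""
    import boto3  # imported lazily; the functions above do not need it

    return boto3.client('s3').head_object(Bucket=bucket, Key=key)['ETag']
```

A caller would fetch `remote_etag(bucket, key)`, skip the download when `is_current(db_path, etag, fname)` is true, and otherwise download the object and call `record(db_path, etag, fname)`.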

9000 answered Sep 29 '22 17:09