How to get the md5sum of a file on Amazon's S3

Tags:

amazon-s3

If I have existing files on Amazon's S3, what's the easiest way to get their md5sum without having to download the files?

Switch asked Nov 21 '09 15:11



2 Answers

AWS's documentation of ETag says:

The entity tag is a hash of the object. The ETag reflects changes only to the contents of an object, not its metadata. The ETag may or may not be an MD5 digest of the object data. Whether or not it is depends on how the object was created and how it is encrypted as described below:

  • Objects created by the PUT Object, POST Object, or Copy operation, or through the AWS Management Console, and are encrypted by SSE-S3 or plaintext, have ETags that are an MD5 digest of their object data.
  • Objects created by the PUT Object, POST Object, or Copy operation, or through the AWS Management Console, and are encrypted by SSE-C or SSE-KMS, have ETags that are not an MD5 digest of their object data.
  • If an object is created by either the Multipart Upload or Part Copy operation, the ETag is not an MD5 digest, regardless of the method of encryption.

Reference: http://docs.aws.amazon.com/AmazonS3/latest/API/RESTCommonResponseHeaders.html
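To apply these rules without downloading the object, one option is a HEAD request, which returns the ETag header. The sketch below is a minimal illustration, not an official recipe: it assumes the third-party boto3 library is installed and credentials are configured, and the helper names (`fetch_etag`, `etag_as_md5`, `local_md5`) are made up for this example. A 32-character hex ETag is only a candidate MD5; per the rules above it is not one for SSE-C or SSE-KMS objects.

```python
import hashlib
import re


def local_md5(path, chunk_size=1024 * 1024):
    """Hex MD5 of a local file, read in chunks to keep memory use low."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def etag_as_md5(etag):
    """Return the ETag as a candidate hex MD5, or None.

    None means the value is not 32 hex characters, e.g. it carries the
    "-N" suffix seen on multipart uploads. A returned value is still only
    a candidate: for SSE-C / SSE-KMS objects it is not an MD5.
    """
    value = etag.strip('"').lower()
    return value if re.fullmatch(r"[0-9a-f]{32}", value) else None


def fetch_etag(bucket, key):
    """HEAD the object and return its raw ETag (S3 returns it in quotes)."""
    import boto3  # third-party; imported lazily so the helpers above work without it

    s3 = boto3.client("s3")
    return s3.head_object(Bucket=bucket, Key=key)["ETag"]
```

With a hypothetical bucket and key, `etag_as_md5(fetch_etag("my-bucket", "path/to/file"))` could then be compared against `local_md5("file")` for a copy you already have.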

Dennis answered Sep 22 '22 21:09


ETag does not seem to be MD5 for multipart uploads (as per Gael Fraiteur's comment). In these cases it carries a suffix made of a hyphen followed by a number. Even the part before the hyphen does not seem to be the MD5 of the object, although it is the same length as an MD5 digest. Possibly the suffix is the number of parts uploaded?
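A commonly reported explanation, which I have not seen spelled out in the ETag documentation quoted above and which should be treated as an assumption, is that the part before the hyphen is the MD5 of the concatenated binary MD5 digests of each part, and the suffix is the part count. Under that assumption, a local equivalent could be sketched as follows; `multipart_etag` and `part_size` are illustrative names, and the part size must match whatever the uploading tool used.

```python
import hashlib


def multipart_etag(data, part_size):
    """Compute an ETag in the "<md5-of-part-md5s>-<part count>" form.

    Assumption: this mirrors the widely observed multipart behaviour for
    plaintext / SSE-S3 objects; it is not taken from the documentation.
    """
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    # Binary (not hex) MD5 digest of each part, in upload order.
    part_digests = [
        hashlib.md5(data[offset:offset + part_size]).digest()
        for offset in range(0, len(data), part_size)
    ]
    # MD5 over the concatenated digests, then the number of parts.
    combined = hashlib.md5(b"".join(part_digests)).hexdigest()
    return f"{combined}-{len(part_digests)}"
```

This takes the whole object as bytes for brevity; for large files the same idea works by reading one part at a time. If the result matches the ETag reported by S3, the local copy very likely matches the uploaded object.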

Duncan Harris answered Sep 22 '22 21:09