If I have existing files on Amazon's S3, what's the easiest way to get their md5sum without having to download the files?
Open a terminal window. Type the following command: md5sum [type file name with extension here] [path of the file] -- NOTE: You can also drag the file to the terminal window instead of typing the full path. Hit the Enter key. You'll see the MD5 sum of the file.
In order to make sure that the object is transmitted back-and-forth properly, S3 uses checksums, basically a kind of digital fingerprint. S3's PutObject function already allows you to pass the MD5 checksum of the object, and only accepts the operation if the value that you supply matches the one computed by S3.
Amazon S3 uses checksum values to verify the integrity of data that you upload to or download from Amazon S3. In addition, you can request that another checksum value be calculated for any object that you store in Amazon S3.
AWS's documentation of ETag
says:
The entity tag is a hash of the object. The ETag reflects changes only to the contents of an object, not its metadata. The ETag may or may not be an MD5 digest of the object data. Whether or not it is depends on how the object was created and how it is encrypted as described below:
- Objects created by the PUT Object, POST Object, or Copy operation, or through the AWS Management Console, and are encrypted by SSE-S3 or plaintext, have ETags that are an MD5 digest of their object data.
- Objects created by the PUT Object, POST Object, or Copy operation, or through the AWS Management Console, and are encrypted by SSE-C or SSE-KMS, have ETags that are not an MD5 digest of their object data.
- If an object is created by either the Multipart Upload or Part Copy operation, the ETag is not an MD5 digest, regardless of the method of encryption.
Reference: http://docs.aws.amazon.com/AmazonS3/latest/API/RESTCommonResponseHeaders.html
ETag does not seem to be MD5 for multipart uploads (as per Gael Fraiteur's comment). In these cases it contains a suffix of minus and a number. However, even the bit before the minus does not seem to be the MD5, even though it is the same length as an MD5. Possibly the suffix is the number of parts uploaded?
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With