I know that SHA-256 is favored over MD5 for security, etc., but, if I am to use a method to only check file integrity (that is, nothing to do with password encryption, etc.), is there any advantage of using SHA-256?
Since MD5 is 128-bit and SHA-256 is 256-bit (therefore twice as big)...
Would it take up to twice as long to encrypt?
Where time is not of the essence, like in a backup program, and file integrity is all that is needed, would anyone argue against MD5 in favor of a different algorithm, or even suggest a different technique?
Does using MD5 produce a checksum?
Predictably, these are also the hashing algorithms that are often used when generating digital signatures and authenticating digital records. The problem is that, while they are all often used to verify data integrity, only SHA-256 is still considered secure; MD5 and SHA-1 have known vulnerabilities.
The SHA-256 algorithm returns a hash value of 256 bits, or 64 hexadecimal digits. While not quite perfect, current research indicates it is considerably more secure than either MD5 or SHA-1. Performance-wise, a SHA-256 hash is about 20-30% slower to calculate than either MD5 or SHA-1 hashes.
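To make the size and speed figures concrete, here is a minimal sketch using Python's standard `hashlib` module. The helper names `digest_hex_lengths` and `time_hash` are made up for this illustration, and the timings depend heavily on your CPU and on how your Python build links its hash implementations, so treat any measured ratio as a rough indication rather than a confirmation of the 20-30% figure.

```python
import hashlib
import time


def digest_hex_lengths(data: bytes) -> dict:
    """Return how many hex digits each algorithm produces for `data`."""
    return {
        name: len(hashlib.new(name, data).hexdigest())
        for name in ("md5", "sha1", "sha256")
    }


def time_hash(name: str, data: bytes, repeats: int = 5) -> float:
    """Return the best wall-clock time in seconds to hash `data` once."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        hashlib.new(name, data).digest()
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    payload = b"\x00" * (16 * 1024 * 1024)  # 16 MiB of dummy data
    print(digest_hex_lengths(payload))
    for name in ("md5", "sha1", "sha256"):
        print(name, f"{time_hash(name, payload):.4f} s")
```

The digest length is fixed by the algorithm (128, 160 and 256 bits respectively), regardless of how large the input is.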
All in all, I'd say that MD5 in addition to the file name is safe enough for detecting accidental corruption. SHA-256 would just be slower and harder to handle because of its size. You could also use something less secure than MD5 without any problem. As long as nobody is deliberately trying to tamper with your files, that is safe, too.
Although slower, SHA-1 is harder to attack than MD5. It produces a larger digest, 160 bits compared to 128 bits, so a brute-force attack is much more difficult to carry out. Note, however, that practical collisions have been demonstrated for SHA-1 (the SHAttered attack in 2017) as well as for MD5, so neither should be relied on against a deliberate attacker.
Both SHA-256 and MD5 are hashing algorithms. They take your input data, in this case your file, and output a 256-bit or 128-bit number, respectively. This number is a checksum. There is no encryption taking place, because an infinite number of inputs can result in the same hash value, although in reality collisions are rare.
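As an illustration of using a hash as a file checksum, here is a minimal sketch with Python's standard `hashlib`. The function names `file_digest` and `is_intact` and the 1 MiB chunk size are choices made for this example, not part of any particular backup tool.

```python
import hashlib


def file_digest(path: str, algorithm: str = "md5", chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so large files never sit fully in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"" (EOF)
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def is_intact(path: str, expected_hex: str, algorithm: str = "md5") -> bool:
    """Compare the file's current digest with one recorded earlier."""
    return file_digest(path, algorithm) == expected_hex.lower()
```

A backup program could record `file_digest(path)` next to each file name when the backup is made and call `is_intact` on restore. Switching from MD5 to SHA-256 is then just a matter of passing `algorithm="sha256"`.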
SHA256 takes somewhat more time to calculate than MD5, according to this answer.
Offhand, I'd say that MD5 would probably be suitable for what you need.
Every answer seems to suggest that you need a secure hash to do the job, but cryptographic hashes spend extra work on resisting deliberate attacks (and password-hashing schemes in particular are tuned to be slow, so that a brute-force attacker needs lots of computing power). Depending on your needs, this may not be the best solution.
There are algorithms specifically designed to hash files as fast as possible for integrity checks and comparison (murmur, XXhash...). Obviously these are not designed for security, as they don't meet the requirements of a secure hash algorithm (i.e. randomness), but they have low collision rates for large messages. These features make them ideal if you are looking for speed rather than security.
Examples of these algorithms and a comparison can be found in this excellent answer: Which hashing algorithm is best for uniqueness and speed?.
As an example, we at our Q&A site use murmur3 to hash the images uploaded by the users so we only store them once even if users upload the same image in several answers.
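The deduplication idea can be sketched roughly as follows. This is not the site's actual implementation: murmur3 is not in the Python standard library, so `zlib.crc32` stands in here as the fast non-cryptographic hash, and the `DedupStore` class is hypothetical. Because a 32-bit checksum can collide, the sketch compares the actual bytes before treating two uploads as the same image.

```python
import zlib


class DedupStore:
    """Keeps one copy of each distinct blob, indexed by a fast checksum."""

    def __init__(self):
        # checksum -> list of distinct blobs that share that checksum
        self._buckets = {}

    def add(self, blob: bytes) -> tuple:
        """Store `blob` if it is new; return a (checksum, index) reference."""
        key = zlib.crc32(blob)  # stand-in for murmur3 (assumption)
        bucket = self._buckets.setdefault(key, [])
        for index, existing in enumerate(bucket):
            if existing == blob:  # byte comparison guards against collisions
                return key, index
        bucket.append(blob)
        return key, len(bucket) - 1

    def count(self) -> int:
        """Number of distinct blobs actually stored."""
        return sum(len(bucket) for bucket in self._buckets.values())
```

Uploading the same image twice returns the same reference and stores it only once. The checksum only narrows down the candidates, which is why a fast, non-secure hash is enough here.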